GPT-5 accused of making no progress? Epoch's year-end report says otherwise: AI is advancing at breakneck speed, and ASI is drawing closer.
[Introduction] Epoch AI's year-end summary is here! Surprisingly, AI has not stagnated; it has accelerated.
Epoch AI has released quite a lot of material recently.
They tested several Chinese open-weight models on FrontierMath.
The result: their best scores on difficulty tiers 1-3 lag the world's top AI models by about seven months.
On the harder tier 4, almost all of the Chinese open-weight models scored zero.
The only one to score was DeepSeek-V3.2 (Thinking), which answered a single question correctly for 1/48 ≈ 2%.
To be fair, while these Chinese open-weight models scored zero, non-Chinese models did not fare well either.
Top models such as GPT and Gemini score extremely high on traditional math benchmarks (e.g., GSM8K and MATH), yet their accuracy on FrontierMath is also low.
Still, as the table shows, they at least outperform the Chinese open-weight models; the reason for the gap is not yet known.
All models struggle because FrontierMath is no ordinary benchmark: it was designed jointly by more than 60 leading mathematicians and is endorsed by a Fields Medalist.
It is a genuinely comprehensive math exam, not a quiz of formula plugging and routine calculus. Its problems are original, expert-level, and hard, spanning number theory, real analysis, algebraic geometry, category theory, and more, with some research-level questions that take hours or even days to solve.
This shows that, faced with truly hard mathematics, AI is not yet a "problem-solving machine" but more like a schoolchild who occasionally stumbles onto the right answer.
AI Evolution Has Accelerated Again
They also released their latest data insights, and the conclusion is quite surprising:
The growth of AI capabilities is faster than before!
They track the capabilities of frontier AI models with a composite indicator called the Epoch Capabilities Index (ECI).
The results show that since April 2024, the growth rate of AI capabilities has accelerated markedly: it is nearly twice as fast as before.
In other words, over the past few years AI capability growth has not been one steady upward curve; at a certain point it suddenly started to climb faster.
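To make the "nearly twice as fast" claim concrete, here is a purely illustrative Python sketch (not Epoch's actual data or methodology) of how such a change can be quantified: fit a linear trend to a capability index before and after a candidate breakpoint and compare the slopes. All numbers below are made up.

```python
# Illustrative only: estimate how much a capability index's growth rate changed
# around a breakpoint by fitting separate linear trends before and after it.
import numpy as np

# Hypothetical monthly index values; months counted from Jan 2023, and month 16
# (~April 2024) is the assumed breakpoint from the article. The data is synthetic.
months = np.arange(0, 36)
break_month = 16
index = np.where(months < break_month,
                 1.0 * months,                      # slower pre-breakpoint trend (made up)
                 16 + 1.9 * (months - break_month)) # faster post-breakpoint trend (made up)

pre_slope = np.polyfit(months[months < break_month], index[months < break_month], 1)[0]
post_slope = np.polyfit(months[months >= break_month], index[months >= break_month], 1)[0]

print(f"pre-breakpoint growth:  {pre_slope:.2f} index points/month")
print(f"post-breakpoint growth: {post_slope:.2f} index points/month")
print(f"speed-up factor: {post_slope / pre_slope:.1f}x")  # ~1.9x on this toy data
```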
Two factors drive this: reasoning models have become stronger, and more effort has gone into reinforcement learning.
Many people think AI progress has slowed because there has been no single huge leap since GPT-4.
The data says otherwise: progress never stopped; only its direction and rhythm changed. It has been accelerating on core skills such as reasoning, rather than relying on "bigger models + more parameters".
Top 10 Insights of the Year
Epoch AI has just released a hardcore year-end review.
Over the course of 2025 they published 36 data insights and 37 newsletters.
Of these roughly 70 short pieces, which were the most popular?
Epoch AI's year-end summary answers that question.
The following ten were the biggest hits with readers.
The first five are the most popular data insights.
1. The Cost of AI Inference Has Dropped Drastically
To be precise, LLM inference prices have fallen rapidly, though unevenly across tasks.
Between April 2023 and March 2025, Epoch AI observed that, at a fixed performance level, the price per token fell by more than a factor of 10.
In other words, getting an answer of a given quality now costs less than a tenth of what it did two years earlier.
Falling costs mean AI becomes more accessible: no longer a technology only large companies can afford, but a tool everyone can use.
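As a back-of-the-envelope check on what "more than 10x over roughly two years" implies per year, here is a quick sketch with assumed round figures (a lower-bound factor of 10 and a 23-month window, not Epoch's exact data):

```python
# Rough arithmetic: annualized price decline implied by a >10x drop over ~23 months.
total_drop = 10   # reported factor is ">10x", so treat 10 as a lower bound
months = 23       # April 2023 -> March 2025

annual_factor = total_drop ** (12 / months)
print(f"implied annual price decline: ~{annual_factor:.1f}x per year")  # ≈ 3.3x
# i.e. a token of equivalent quality costs roughly a third of what it did a year earlier.
```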
2. The AI "Brain" Is Moving into Your Computer
Within roughly a year, cutting-edge AI performance becomes attainable on consumer-grade hardware.
Today, the best open-weight models that fit on a consumer GPU do well on benchmarks such as GPQA, MMLU, AA Intelligence, and LMArena, trailing the top AI models by less than a year.
Since the strongest open-weight models already run on ordinary consumer graphics cards, your laptop may soon be running large AI models.
More broadly, any frontier AI capability may become widely accessible to the public within a year.
3. Most of OpenAI's Computing Power in 2024 Was Actually Used for Experiments
According to media reports, in 2024 most of OpenAI's compute went not to inference or final training runs but to experiments that support further development.
Yes, it's not what you might assume: the compute isn't spent only on training models or serving users around the clock; a great deal goes to trial and error, exploration, and experimentation.
This shows that current AI R&D still leans heavily on large-scale experimentation rather than just running a few benchmarks.
It also means that, today, most AI costs come from experiments rather than from training and deployment.
4. The Computing Power of NVIDIA Chips Doubles Every 10 Months!
Since 2020, the installed AI computing power of NVIDIA chips has more than doubled every year.
Each new flagship chip comes to account for the majority of installed compute within three years of release.
GPUs, in short, remain the core fuel of AI computing, and they are growing extremely fast.
Sustaining the current pace of AI development requires compute to keep doubling, which means Jensen Huang and the other chipmakers still have plenty of business ahead of them.
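A quick sanity check that the headline ("doubles every 10 months") and the body ("more than doubled every year") agree, using illustrative arithmetic only:

```python
# A 10-month doubling time implies more-than-2x growth per calendar year.
doubling_time_months = 10
annual_growth = 2 ** (12 / doubling_time_months)
print(f"annual growth factor: ~{annual_growth:.1f}x")            # ≈ 2.3x per year
print(f"three-year growth factor: ~{annual_growth ** 3:.0f}x")   # ≈ 12x over three years
```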
5. Both GPT-4 and GPT-5 Represent Significant Leaps
Some complain that OpenAI ships updates so frequently that the progress is hard to see, but don't be fooled.
Both GPT-4 and GPT-5 delivered significant leaps on benchmark tests, far surpassing their predecessors.
This year's AI development is therefore not a pile of minor tweaks but a genuine jump in capability.
So why did many people feel let down when GPT-5 came out?
Because models have been released far more frequently over the past two years, each individual release feels like a smaller step; it is not that capability growth has slowed.
Top 5 Hottest Gradient Articles: Thoughts Behind the Insights
The next five are the most popular articles from the Gradient column.
Gradient is Epoch AI's column dedicated to short news updates.
6. Does ChatGPT Consume a Surprising Amount of Electricity? Not Really
How much energy does an average GPT-4o query consume?
The answer: less than a light bulb uses in five minutes.
This conclusion has since been confirmed by Altman and is in line with the per-prompt energy cost Google reported for Gemini.
In other words, public concern about AI's energy consumption is more exaggerated than the reality.
That said, AI's energy use has been growing exponentially and may become a serious problem in the future.
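For a rough sense of scale, here is the light-bulb comparison with assumed round numbers (broadly in line with published per-query estimates, but not exact figures):

```python
# Illustrative comparison with assumed figures: one chatbot query vs. a bulb for 5 minutes.
query_energy_wh = 0.3          # assumed energy per query, in watt-hours
bulb_power_w = 10              # assumed LED bulb power draw, in watts
bulb_5min_wh = bulb_power_w * 5 / 60

print(f"one query:      ~{query_energy_wh:.2f} Wh")
print(f"bulb for 5 min: ~{bulb_5min_wh:.2f} Wh")
print(f"query / bulb:   ~{query_energy_wh / bulb_5min_wh:.0%}")  # well under 100%
```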
7. How Did DeepSeek Improve the Transformer Architecture?
This article clearly explains the three core techniques that let DeepSeek-V3 become the strongest open-weight model of its time while using less compute.
The three are multi-head latent attention (MLA), an improved mixture-of-experts (MoE) architecture, and multi-token prediction (a minimal sketch of MLA appears below).
Three days after this article was published, DeepSeek released R1, which caused a huge stir across the global AI community: its performance rivals OpenAI's o1 at a fraction of the development cost.
The whole field took away an important lesson: clever architectural innovation = lower R&D costs + faster deployment.
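To make the MLA idea concrete, here is a minimal single-head NumPy sketch (not DeepSeek-V3's actual code, dimensions, or weights): the KV cache stores only a small latent vector per token, which is re-expanded into keys and values at attention time. In the real architecture the latent is shared across many heads, so the cache savings are far larger than in this toy example.

```python
# Toy single-head "latent attention": cache a small latent per token instead of full K and V.
import numpy as np

d_model, d_latent, d_head, seq_len = 512, 64, 64, 8
rng = np.random.default_rng(0)

W_q   = rng.normal(0, 0.02, (d_model, d_head))    # query projection
W_dkv = rng.normal(0, 0.02, (d_model, d_latent))  # down-projection to the latent (what gets cached)
W_uk  = rng.normal(0, 0.02, (d_latent, d_head))   # latent -> keys
W_uv  = rng.normal(0, 0.02, (d_latent, d_head))   # latent -> values

x = rng.normal(0, 1, (seq_len, d_model))          # token representations

# The KV cache holds only the latent: seq_len x d_latent, instead of full keys and values.
latent_cache = x @ W_dkv

q = x @ W_q
k = latent_cache @ W_uk
v = latent_cache @ W_uv

# Causal scaled dot-product attention.
scores = (q @ k.T) / np.sqrt(d_head)
scores = np.tril(scores) + np.triu(np.full_like(scores, -np.inf), k=1)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)
out = attn @ v

print("cached per token with MLA :", d_latent, "floats (shared across all heads)")
print("cached per token per head :", 2 * d_head, "floats in standard attention (K and V)")
print("output shape              :", out.shape)
```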
8. How Far Can Reasoning Models Go? What Are the Limits?
The author analyzed the growth pattern and ceiling of reasoning training, concluding that although reasoning is indeed important, its growth will not