
Revealing the internal perspective of DeepMind: Scaling Law isn't dead, and computing power is everything.

新智元 2025-12-31 20:43
As 2025 draws to a close, a Chinese researcher at DeepMind has penned a 10,000-word essay revealing Google's internal predictions about AI: apart from computing power, everything else is just noise.

Today is the last day of 2025. Many people are conducting AI reviews and summaries on this day.

After a year of being bombarded with news about models, computing power, and capital, how far is AI from artificial general intelligence (AGI)?

If 2024 was the year of people's curiosity about AI, then 2025 was the year when AI profoundly impacted human society.

In this year full of uncertainties, we heard very different voices:

Sam Altman boldly predicted in his blog post "The Gentle Singularity" in mid-2025:

"We already know how to build AGI. In 2026, we will see systems that can generate original insights." He firmly believes that the Scaling Law is far from reaching its ceiling, and the cost of intelligence will approach zero with the automated production of electricity.

Extended reading: Altman: The gentle singularity has arrived! AI will ultimately control the physical world, and there will be a major turning point in human destiny in 2030

Jensen Huang of NVIDIA shifted his focus from "computing power worship" to the "AI factory."

He mentioned in a speech at the end of 2025:

"The bottleneck of AI is no longer imagination but electricity. In the future, the Scaling Law will not only involve stacking models but also a 100,000-fold leap in inference efficiency."

Extended reading: NVIDIA's AI factory: An absolute necessity brewed over 12,000 years in human history

In contrast, Yann LeCun, the former chief AI scientist at Meta, remains as outspoken as ever. He even stated publicly before leaving to start a new company:

"Large language models (LLMs) are a dead end on the path to AGI. They have no world model, like a castle in the air without a body."

Extended reading: LeCun bets the rest of his career that large models are doomed! Hassabis vows to continue with Scaling

In 2026, can the Scaling Law still hold up?

Regarding this question, a long article of tens of thousands of words by a Chinese researcher from DeepMind became popular on social media:

The Scaling Law is not dead! Computing power is still king, and AGI is just getting started.

Article link: https://zhengdongwang.com/2025/12/30/2025-letter.html

This article is the 2025 annual letter written by Zhengdong Wang, a researcher at Google DeepMind.

The author, from a unique personal perspective, reviewed the drastic changes in the AI field from 2015 to the present and analyzed in depth the core driving force behind it all: computing power.

Although the outside world is skeptical about the Scaling Laws, history has repeatedly proven that with the exponential growth of computing power, AI models continuously demonstrate capabilities beyond human expectations.

Based on his work experience at DeepMind, the author verified the "bitter lesson" of Richard S. Sutton, the godfather of reinforcement learning:

General methods that leverage computation will ultimately defeat approaches built on handcrafted human knowledge.

This has also been our strongest impression this year!

Other than computing power, everything else is just noise

On December 30, 2025, looking back on this eventful year, what comes to mind is the visual revolution set off by AlexNet thirteen years ago.

The conference in which Geoffrey Hinton, Fei-Fei Li, and Ilya Sutskever all participated might be the real origin of today's AI era.

Back then, most people thought that artificial intelligence was just a game of "feature engineering" and "human ingenuity." Today, we have entered a completely different dimension:

An era dominated by computing power, driven by the Scaling Law, and in which AGI is just setting off on its journey.

Recently, the focus of everyone's attention has been: Has the Scaling Law hit a wall?

The belief in computing power: Why the Scaling Law has never failed

At the end of 2024, pessimism ran deep in the industry: the depletion of pre-training data and diminishing marginal returns were widely taken as signs that the Scaling Law had reached its end.

However, standing at the end of 2025, we can responsibly say:

The Scaling Law is not only alive but is undergoing a profound evolution from "brute-force parameter stacking" to "intelligence density."

Fifteen-year continuity

To understand the Scaling Law, we must first see its historical resilience.

Research shows that in the past fifteen years, the computing power used to train AI models has increased by a factor of four to five every year.

This exponential compound growth is rare in human technological history.
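To get a feel for how fast that compounds, the quoted annual rate can simply be multiplied out over fifteen years. The totals below are an illustration derived from the 4-5x figure above, not numbers from the letter itself:

```python
# Compound the growth rate quoted above (4-5x per year) over fifteen years.
for annual_factor in (4.0, 5.0):
    total = annual_factor ** 15
    print(f"{annual_factor:.0f}x per year for 15 years -> roughly {total:.1e}x in total")

# Prints approximately:
# 4x per year for 15 years -> roughly 1.1e+09x in total
# 5x per year for 15 years -> roughly 3.1e+10x in total
```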

Inside DeepMind, it has been observed that the number of floating-point operations consumed in training these models now exceeds the number of stars in the observable universe.

This growth is not blind but is based on extremely stable empirical formulas.

According to the empirical research of Kaplan, Hoffmann, and others, there is a clear power-law relationship between performance and computing power: Performance improvement is proportional to the 0.35th power of computing power.

Article link: https://fourweekmba.com/ai-compute-scaling-the-50000x-explosion-2020-2025/

This means that a tenfold increase in computing power buys roughly a two- to three-fold improvement in performance, and crossing a 1000-fold computing power gap yields an astonishing improvement of roughly tenfold.
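As a back-of-the-envelope check of those figures, the quoted exponent can be evaluated directly. The functional form below (performance gain proportional to the compute multiplier raised to the 0.35 power) is simply the relationship cited above, plugged into numbers:

```python
# Evaluate the power law cited above: performance gain ~ (compute multiplier) ** 0.35

def performance_gain(compute_multiplier: float, exponent: float = 0.35) -> float:
    """Relative performance improvement predicted by the power law."""
    return compute_multiplier ** exponent

for c in (10, 100, 1000):
    print(f"{c:>4}x compute -> ~{performance_gain(c):.1f}x performance")

# Prints approximately:
#   10x compute -> ~2.2x performance
#  100x compute -> ~5.0x performance
# 1000x compute -> ~11.2x performance
```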

Qualitative leap and emergent capabilities

The most fascinating aspect of the Scaling Law is that it not only brings about a quantitative reduction in errors but also induces unpredictable qualitative leaps.

In DeepMind's experiments, as the computing power increases, models suddenly demonstrate "emergent capabilities" such as logical reasoning, following complex instructions, and factual correction.

This phenomenon means that computing power is not just fuel but is itself a physical quantity that can give rise to intelligence.

The truth in 2025 is that we have shifted from simple "pre-training Scaling" to "Scaling in all four dimensions":

  1. Pre-training Scaling

Build basic cognition through massive amounts of multimodal data.

  2. Post-training Scaling

Use reinforcement learning (RL) for alignment and preference optimization.

  3. Inference Scaling

Let the model "think longer" before answering (a minimal sketch follows this list).

  4. Context Scaling

Improve end-to-end task capabilities through long-term memory.
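To make the third dimension concrete, here is a minimal sketch of one common form of inference scaling, best-of-n sampling with majority voting. The `generate_answer` function is a hypothetical stand-in for a stochastic model call, not an API from the letter or from any Google product:

```python
import random
from collections import Counter

def generate_answer(question: str) -> str:
    # Hypothetical stand-in for a single stochastic model call;
    # imagine it is right most of the time but not always.
    return random.choice(["42", "42", "41"])

def best_of_n(question: str, n: int) -> str:
    """Spend more inference-time compute: sample n answers, return the majority vote."""
    votes = Counter(generate_answer(question) for _ in range(n))
    return votes.most_common(1)[0][0]

print(best_of_n("What is 6 x 7?", n=1))   # a single sample can easily be wrong
print(best_of_n("What is 6 x 7?", n=64))  # more samples make "42" far more likely to win
```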

The "1000-fold computing power" moment experienced at DeepMind

If the Scaling Law is a macro philosophy, then the experiment Zhengdong Wang experienced at DeepMind in 2021 was a micro revelation.

That experience completely reshaped Zhengdong Wang's view of intelligence and made him understand why "computing power is king."

The devaluation of algorithmic ingenuity

At that time, the DeepMind team was trying to solve the problem of navigation and interaction in a 3D virtual environment for embodied AI.

It was a typical "hardcore AI" challenge that involved optimizing complex reinforcement learning algorithms.

At that time, the consensus was that the bottleneck of this problem lay in the sophistication of the algorithm, specifically in how to design better sampling strategies and reward functions.

However, a colleague proposed an almost "reckless" solution: Don't change the algorithm; just increase the computing power input by a thousand times.

After that computing power surge, a miracle happened!

Those logical dead ends that were originally thought to require breakthrough human ingenuity to solve simply "melted" in the face of massive matrix multiplications.

The algorithm didn't become smarter, but scale gave it a kind of robustness similar to biological instincts.

The impact of the computing power wave

At that moment, Zhengdong Wang deeply understood the truth expressed by Richard Sutton in "The Bitter Lesson":

Human so-called "ingenuity" in the field of AI is often insignificant in the face of the exponential growth of computing power.

This realization is like a huge "computing power wave" washing over you: rather than racking your brains to squeeze out a 1% improvement in algorithmic efficiency, it is better to embrace a 1000-fold expansion of computing power.

This perspective has become the common language within DeepMind in 2025:

We no longer ask "Can this problem be solved?" but "How much computing power is needed to solve this problem?"

It is this mindset that lets us dare to pour far more money into data centers than was ever spent on the Apollo program.

The limits and challenges of infrastructure: The arrival of the 1GW era

Zhengdong Wang also provides an additional perspective.

When DeepMind discusses computing power internally, the topic has shifted from "PFLOPS" to "GW."

In 2025, AI is no longer just code; it is heavy industry, the ultimate integration of land, energy, and custom silicon chips.

The generational leap in hardware: Blackwell and Ultra

This ultimate integration can be summed up in one phrase: the "AI factory," a concept proposed by Jensen Huang at the GTC conference.

Wang believes that the Blackwell platform delivered by NVIDIA in 2025 is the physical foundation for DeepMind to maintain its belief in the Scaling Law.

The GB200 NVL72 system interconnects 72 GPUs into a single supercomputing engine, and its inference speed for trillion-parameter models is 30 times faster than that of the H100.

The launch of the Blackwell Ultra has pushed single-chip memory to 288GB, which means that even models in the 300B-parameter class can keep their weights resident in memory without offloading, something crucial for long-context and high-concurrency inference.
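How much model fits in 288GB depends mainly on numeric precision. The rough calculation below uses standard bytes-per-parameter figures (not numbers from the article) and ignores the KV cache and activations, so it is only an illustration of the trade-off:

```python
# Approximate weight memory for a 300B-parameter model at common precisions.
PARAMS = 300e9
CHIP_MEMORY_GB = 288  # Blackwell Ultra figure quoted above

for name, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    verdict = "fits on one chip" if weights_gb <= CHIP_MEMORY_GB else "needs more than one chip"
    print(f"{name}: ~{weights_gb:.0f} GB of weights ({verdict})")

# Prints approximately:
# FP16: ~600 GB of weights (needs more than one chip)
# FP8: ~300 GB of weights (needs more than one chip)
# FP4: ~150 GB of weights (fits on one chip)
```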

The hard walls of power and heat dissipation

However, the laws of physics are still strict.

As the single-chip power consumption approaches 1000W, DeepMind has had to switch entirely to liquid cooling solutions.

In 2025, people started talking about "AI factories" instead of "data centers."

Amin Vahdat, who leads AI infrastructure at Google, pointed out clearly in an internal meeting that to meet the explosive demand for computing power, Google must double its computing capacity every six months and achieve a 1000-fold increase over the next 4-5 years. The two figures are consistent: doubling every six months means ten doublings in five years, and 2^10 is roughly 1000.