
After Gemini 3, Google Chief Scientist Jeff Dean clarified the three key signals of AI.

AI Deep Researcher · 2025-11-25 09:10
Gemini 3 shifts focus to efficiency and action, transitioning AI from answering questions to taking actions.

With Gemini 3, yet another "most powerful model" has arrived.

But compared with the last time, what exactly has changed? Has the benchmark score increased by a few points, or is the AI truly different?

Right after the release of Gemini 3, on November 22nd, Jeff Dean gave a talk at Stanford University. He systematically reviewed 15 years of AI evolution, from neural networks, TPUs, and the Transformer to sparse models and distillation, and closed by demonstrating what makes Gemini 3 distinctive.

During the speech, Jeff Dean didn't mention benchmark numbers, nor did he do product promotion. What he said was:

Why should AI work like the human brain?

Why should AI evolve from being able to talk to being able to act?

Why does the next generation of AI compete on efficiency rather than parameters?

In Jeff Dean's view, Gemini 3 is not just a larger model; it has completely changed the way AI is used.

These three judgments are the real signals behind this release.

Signal 1: From competing in size to emulating the brain

At the beginning of the speech, Jeff Dean pointed directly at a problem: traditional AI models are extremely wasteful.

He said:

"In traditional neural networks, you activate the entire model for each example, which is very wasteful. A better way is to have a very large model but only activate 1% to 5% of it each time."

Suppose you have a huge model that encompasses various capabilities such as image processing, language understanding, mathematics, and coding. The traditional approach is to call the entire model no matter what question you ask. It's like turning on all the electrical appliances in your home every time you turn on a light. Jeff Dean's idea is to run only the necessary part according to the task type. When processing images, only use the visual module; when writing code, only use the programming module.

He used the brain as an analogy: in an English class, your brain engages the regions that process language; when you're driving, those regions quiet down and the brain concentrates on controlling your body and watching the road. AI models should work the same way.

This idea wasn't proposed just for Gemini 3.

As early as a few years ago, Jeff Dean started pushing his team to conduct research in this direction and gave it a name: Pathways architecture. The core goal of this architecture is to build an extremely large model while maintaining high efficiency, so that only a small part of the paths are activated during each inference.

Google achieved this through the "Mixture of Experts" (MoE) technique:

  • The model contains many expert modules.
  • Some are good at images, some at language, and some at fusing information.
  • When input arrives, the system automatically decides which experts to call.
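The routing idea above can be sketched in a few lines. Everything here is illustrative, not Gemini's actual architecture: the gate is a single linear layer, the "experts" are toy linear maps, and `top_k=2` is an arbitrary choice; the point is only that most experts stay idle for any single input.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top-k experts chosen by a learned gate.

    Only the selected experts run, so most of the model's
    parameters stay idle for any single input.
    """
    scores = softmax(gate_weights @ x)           # one gating score per expert
    top = np.argsort(scores)[-top_k:]            # indices of the top-k experts
    # Weighted sum over only the activated experts' outputs.
    out = sum(scores[i] * experts[i](x) for i in top)
    return out / scores[top].sum()               # renormalize over chosen experts

# Toy setup: 8 "experts", each a different linear map; only 2 run per input.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(4, 4)): W @ x for _ in range(8)]
gate_weights = rng.normal(size=(8, 4))
y = moe_forward(rng.normal(size=4), experts, gate_weights, top_k=2)
```

In production systems the gate is trained jointly with the experts and routing happens per token, but the compute saving comes from exactly this mechanism: run 2 of 8 experts and you pay roughly a quarter of the dense cost.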

How effective is it? The figure Jeff Dean showed in the talk was striking: for the same computing budget, the MoE architecture can train a model with an 8-fold improvement in performance.

And Gemini 3 is the latest implementation of this concept.

It no longer loads all the weights at once but calls the expert modules as needed.

The result is: stronger performance, lower cost, and the ability to handle multiple tasks simultaneously. Just like your brain: multiple regions work together when dealing with complex problems, and only a small part is used when doing simple things.

What does this transformation mean?

Future top-tier models will no longer be all-rounders that do everything themselves, but teams of specialists working together.

The key to AI competition has shifted from "whose model is larger" to "who can invoke the right expertise at the right time."

Signal 2: AI is no longer just for answering questions

If the first section was about how the model got smarter internally, this one is about something more visible: it has started doing things for you.

During the talk, Jeff Dean demonstrated an example: a user had a pile of family recipes, some handwritten in Korean, some in English, all in old photos with creases and oil stains.

The user's requirement was simple: to create a bilingual recipe website.

Then what did Gemini 3 do?

  • Step 1: Scan the photos and recognize the text.
  • Step 2: Translate everything into a bilingual version.
  • Step 3: Automatically generate the website layout.
  • Step 4: Pair each recipe with an AI-generated illustration.

Throughout the process, the user only said one sentence.

This is the difference between a traditional assistant and an intelligent agent. An assistant answers whatever you ask, while an agent takes your goal, breaks down the tasks by itself, calls tools, and completes the entire operation chain.

Jeff Dean said:

AI is not just about answering you but has the ability to take action.

The technological breakthrough behind this ability is: reinforcement learning in verifiable domains.

What does it mean?

Take programming as an example:

  • The AI generates a piece of code.
  • The system automatically checks: Can it be compiled?
  • If it can, give a reward; if not, give a penalty.
  • Furthermore: Has the code passed the unit test?
  • If it has, give more rewards.

The same logic also applies to mathematics:

  • The AI generates a proof.
  • The system verifies it with a proof checker.
  • If it's correct, give a reward; if wrong, point out which step is incorrect.
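Proof assistants make mathematics a verifiable domain in exactly this sense. A tiny Lean example (illustrative, unrelated to Gemini's actual training setup) shows both sides of the signal: a definitionally true statement is accepted immediately, while a statement that is true but not definitionally so is rejected until the model supplies the missing inductive steps, and the checker reports exactly which step fails.

```lean
-- Accepted outright: `n + 0` reduces to `n` by definition.
theorem addZeroRight (n : Nat) : n + 0 = n := rfl

-- `theorem zeroAddLeft (n : Nat) : 0 + n = n := rfl` would be
-- REJECTED: `0 + n` does not reduce definitionally, so the
-- checker demands a real proof by induction:
theorem zeroAddLeft (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```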

Jeff Dean said: This technological breakthrough enables the model to truly explore the space of potential solutions, and over time, it gets better at exploring this space.

How astonishing is the effect? Gemini solved five out of six questions in the 2025 International Mathematical Olympiad (IMO) and won a gold medal.

How shocking is this result?

Just three years ago in 2022, AI models were still very weak in mathematical reasoning.

At that time, the most advanced models in the industry managed only about 15% accuracy on GSM8K, a grade-school mathematics benchmark. How hard were the questions? For example: Sean has five toys. He got two more at Christmas. How many toys does he have now?

For such elementary arithmetic questions, the AI's correct rate was only 15% at that time.

Now, Gemini can solve the questions of the International Mathematical Olympiad, which are the most difficult problems in the global competition for mathematical geniuses.

It took less than three years to go from elementary arithmetic to an IMO gold medal.

This leap shows that AI has not just gotten better at answering questions; it has real problem-solving ability. It can explore, try, and verify on its own until it finds the correct answer.

Specifically, an agent needs three key abilities:

State awareness: Know what you want and understand the current progress.

Tool combination: Be able to call external tools such as search engines, calculators, and APIs.

Multi-step execution: Adjust the plan according to feedback and keep trying until completion.
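The three abilities above fit together as a loop: the agent tracks state, picks a tool, acts, observes the result, and repeats. The sketch below is a deliberately minimal toy, with stub tools and a hard-coded `plan_step` standing in for the model's own planning; real agents replace that stub with a learned policy.

```python
def run_agent(goal, plan_step, tools, max_steps=10):
    """Minimal agent loop: track state, pick a tool, act, observe, repeat.

    `plan_step(state)` returns (tool_name, args), or None when done.
    In a real system the model makes this decision; here it is a stub.
    """
    state = {"goal": goal, "history": []}             # state awareness
    for _ in range(max_steps):
        decision = plan_step(state)
        if decision is None:                          # goal reached
            break
        tool_name, args = decision
        result = tools[tool_name](*args)              # tool combination
        state["history"].append((tool_name, result))  # feedback for next step
    return state

# Toy run loosely modeled on the recipe example: translate, then publish.
tools = {
    "translate": lambda text: f"[en] {text}",
    "publish":   lambda text: f"published: {text}",
}

def plan_step(state):
    done = [name for name, _ in state["history"]]
    if "translate" not in done:
        return ("translate", ("김치찌개 recipe",))
    if "publish" not in done:
        return ("publish", (state["history"][-1][1],))
    return None                                       # both steps finished

final = run_agent("bilingual recipe site", plan_step, tools)
```

The user-facing difference is exactly what the article describes: the caller states a goal once, and the loop decomposes it into tool calls without further instruction.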

And through deep integration with the Google ecosystem, Gemini 3 can connect real - world systems such as calendars, emails, and cloud services and truly utilize these abilities.

Just like the recipe website case mentioned earlier: you don't need to say "first recognize the text, then translate, then typeset", you just need to say "create a website", and Gemini 3 will handle all the steps by itself.

This has changed everyone's work style:

In the past, you had to tell the AI how to do each step.

Now, you just need to state your goal, and the AI will handle the rest.

Your role has changed from a user to a commander.

Signal 3: What determines whether AI can be popularized?

If the Pathways architecture makes the model smarter and the Agent system enables the model to take action, then the third signal is the most easily overlooked but perhaps the most crucial: making AI truly affordable.

Jeff Dean told a story from 2013 at Stanford.

At that time, Google had just developed a very good speech recognition model, with a much lower error rate than the existing systems. Jeff Dean made a calculation: what would happen if 100 million people started talking to their phones for 3 minutes every day?

The answer was: Google would need to double the number of its servers.

In other words, improving a single feature would cost the company a doubling of its server fleet.
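The scale of that back-of-envelope calculation is easy to reproduce. The per-server throughput below is an assumed placeholder (one server transcribing audio in real time, around the clock), not Google's real figure; the point is only that 100 million users quickly translates into hundreds of thousands of machines.

```python
# Back-of-envelope version of the 2013 calculation.
users = 100_000_000                        # people talking to their phones
minutes_per_user = 3                       # minutes of speech per day
audio_minutes = users * minutes_per_user   # 300M minutes of audio per day

# Assumption: one server transcribes audio in real time, 24 hours a day.
minutes_per_server_per_day = 24 * 60
servers_needed = audio_minutes / minutes_per_server_per_day
print(f"{servers_needed:,.0f} extra servers")  # ≈ 208,333
```

Whether that amounts to "doubling the fleet" depends on how many servers already exist, but the order of magnitude explains why a single feature could dominate capacity planning.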

This made Jeff Dean realize: Having a good model is not enough; it must be affordable.

So, the TPU was born.

1. TPU: Hardware designed for efficiency

In 2015, the first-generation TPU went into production. It was designed specifically for machine learning and did one thing extremely well: low-precision linear algebra.

What was the result?

It was 15 to 30 times faster than the CPUs and GPUs of the time, and 30 to 80 times more energy-efficient.

That made it possible to launch, on a small fraction of the existing hardware, features that would otherwise have required doubling the server fleet.

With the latest seventh-generation Ironwood TPU, a single pod contains 9,216 chips. Compared with the first machine-learning supercomputing pod (TPUv2), performance has increased 3,600-fold and energy efficiency 30-fold.

Jeff Dean specifically pointed out that these improvements are not only due to the progress of chip technology. More importantly, Google has made energy efficiency a core goal from the very beginning of the design.

2. Distillation: Enabling small models to learn the capabilities of large models

Hardware is one aspect, and algorithms are another.

Jeff Dean, Geoffrey Hinton, and Oriol Vinyals jointly studied a technology called "distillation".

The core idea is: let the large model be the teacher and teach the small model.

In a speech recognition task, they conducted an experiment:

  • Using 100% of the training data, the accuracy rate was 58.9%.
  • Using only 3% of the training data, the accuracy rate dropped to 44%.
  • But if using distillation, with only 3% of the data, the accuracy rate could reach 57%.

They achieved a result close to that of using 100% of the data with only 3% of the data.

Jeff Dean said:

"You can train a very large model and then use distillation to enable a much smaller model to achieve performance very close to that of the large model."

This is why Gemini can both lead in performance and run on phones. The large model is trained in the cloud; the small model learns from it through distillation and is deployed on-device, with one-tenth the parameters but over 80% of the capability retained.

3. The real threshold: Whether it can be deployed under real-world constraints

But technological breakthrough is just the first step. Jeff Dean believes that for AI to be truly popularized globally, it must face more realistic problems: Is there enough energy? Is the power supply stable? Is the network accessible? Can the devices support it?

This is why Google is promoting AI in emerging markets such as Southeast Asia. These regions may not have a powerful power grid and server infrastructure, but through efficiency technologies such as TPU and distillation, people can still use AI under the existing conditions.

Google's strategy is not to wait for perfect conditions before promoting but to make the technology adapt to reality.

This underlying logic has changed the focus of the entire industry.

In the past, people compared:

  • How powerful is this model?
  • How many parameters does it have? How many tokens?

Now, what really matters is:

  • Can it be used on my device?
  • How low can the cost be reduced?
  • Can it be used offline?

In the next round of competition, it's not about parameters but about implementation efficiency.

Conclusion | From models to systems

Looking at the performance data, this is a model upgrade.

Looking at Jeff Dean's thinking, this is a paradigm shift.

From the dilemma of having to double the servers in 2013 to winning an IMO gold medal in 2025, Jeff Dean has always been answering one question:

How to make AI both powerful and usable?

The answer lies in three transformations:

It's not about whose model is larger but about smarter design (Pathways)

It's not about whose answers are more accurate but about being able to really do things (Agent)

It's not about whose model has more parameters but about making it accessible to more people (TPU + distillation)

Gemini 3 is not the end but the first complete demonstration of this systematic thinking.

📮 Original links:

https://www.youtube.com/watch?v=AnTw_t21ayE&t=921s

https://blockchain.news/ainews/key-ai-trends-and-deep-learning-breakthroughs-insights-from-jeff-dean-s-stanford-ai-club-talk-on-gemini-models

https://blog.google/products/gemini/gemini-3/?utm_source=chatgpt.com

https://www.wired.com/story/google-launches-gemini-3-ai-bubble-search?utm_source=chatgpt.com

This article is from the WeChat official account "AI Deep Researcher", author: AI Deep Researcher, published by 36Kr with authorization.