The head of Gemini 3 pre-training warns: the model war has shifted from algorithms to engineering, and synthetic data has become the core of generational leaps. Google's secret weapon against OpenAI and Meta, revealed.
By the end of 2025, the large-model industry's year-end battle had officially kicked off, with each company unveiling its trump cards. Amid this fierce competition, Gemini 3 broke through in dominant fashion, redrawing the industry's expectations the moment it debuted.
On November 18, Gemini 3 swept multiple authoritative benchmarks, outclassing comparable models worldwide with claims of "the world's most powerful multimodal understanding," "the most deeply interactive agent," and "a reasoning monster." Google CEO Sundar Pichai personally endorsed it, calling it bluntly "the most intelligent model to date." The news set the entire AI community abuzz, and everyone asked the same question: what exactly is behind Gemini 3's strength?
The first clues appeared on release day. Oriol Vinyals, VP of research and deep learning at Google DeepMind, spilled the beans on Twitter: "The core secrets behind Gemini 3's strength are two things: better pre-training and better post-training." That blunt statement instantly made "pre-training" and "post-training" the industry's hottest topics.
Recently, Sebastian Borgeaud, one of the pre-training leads for Gemini 3 and a co-author of the groundbreaking RETRO paper, made his first podcast appearance and dissected the laboratory logic behind this top-tier model. In his view, Gemini 3's leap is not a breakthrough in any single dimension but the result of continuously optimizing countless details: "We can find ways to make the model better almost every day, and the whole team is moving forward at an accelerating pace."
More importantly, Borgeaud pointed to a core transformation: Google has stopped simply "building models" and shifted to "building systems." This echoes Demis Hassabis, co-founder and CEO of DeepMind, who has publicly stated that Gemini 3's strength stems from the deep integration of "research, engineering, and infrastructure."
The secrets of Gemini 3 reflect a profound shift underway across the industry: AI has officially moved from the scaling era of "unlimited data" into a new stage of "limited data." The trend is irreversible and is forcing the entire industry to rethink where innovation comes from. In Borgeaud's view, synthetic data, reasoning traces, long context, continual learning, end-to-end retrieval training, and a reliable evaluation system will together constitute the AI industry's future evolutionary path.
In fact, as early as the classic Chinchilla project, the DeepMind team had grasped a key rule: for a fixed training compute budget, scaling up the data in step with the model, rather than blindly enlarging the model alone, trains a better model. That conclusion still carries great practical weight today: it directly determines inference-serving efficiency and usage cost after training, one of the core considerations for enterprises deploying AI.
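The Chinchilla rule of thumb can be sketched numerically. The commonly cited approximations are that training compute is roughly C ≈ 6·N·D FLOPs (N parameters, D tokens) and that the compute-optimal point lands near 20 tokens per parameter; the helper below is a back-of-envelope illustration of those two published heuristics, not Google's actual recipe.

```python
# Back-of-envelope Chinchilla heuristic: C ~ 6 * N * D, with the
# compute-optimal mix near D ~ 20 * N tokens per parameter.

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Return an approximate compute-optimal (parameters, tokens) split."""
    # Substituting D = 20 * N into C = 6 * N * D gives N = sqrt(C / 120).
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Plugging in Chinchilla's own reported budget (~5.76e23 FLOPs) recovers
# roughly its published configuration: ~70B parameters, ~1.4T tokens.
params, tokens = chinchilla_optimal(5.76e23)
print(f"~{params / 1e9:.0f}B parameters, ~{tokens / 1e12:.1f}T tokens")
```

The practical point the article makes follows directly: for the same training compute, the smaller, data-rich model is much cheaper to serve at inference time than a parameter-heavy alternative.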
As a senior researcher who moved from reinforcement learning into representation learning, Borgeaud has deep pre-training expertise. From the Transformer architecture to BERT and XLNet, and on to Gopher, DeepMind's first large-language-model paper, this line of research experience shaped his distinctive "research taste" and laid the groundwork for Gemini 3's pre-training breakthrough.
Responding to the industry controversy over whether "the pre-training scaling law is dead," Borgeaud was unambiguous: "Scale still matters, but the weight of architectural innovation and data innovation has risen significantly, and has even become more crucial."
So how do you get better models when data is limited? Synthetic data has become the industry's favorite answer, but Borgeaud is notably cautious: "It is indeed an interesting direction, but we must be extremely careful."
In his view, the core risk of synthetic data is not that it doesn't work, but that you can "use it wrongly without even realizing it." Once the data distribution drifts, the model may appear to answer questions better while actually sliding into a self-reinforcing closed loop. His prescription: after generating synthetic data with a powerful model, run small-scale, controlled ablation experiments to verify its benefits and surface its side effects.
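The ablation workflow he describes can be sketched as follows. Everything here is an illustrative stand-in: the "training" and "loss" functions are toy formulas invented for the sketch (more data helps, but synthetic-heavy mixes drift and eventually hurt), not a real training or evaluation API.

```python
# Toy sketch of a synthetic-data ablation: train small models on controlled
# real/synthetic mixtures and compare them on a held-out eval that the
# synthetic data cannot contaminate.

def train_small_model(n_real: int, n_synth: int) -> dict:
    # Stand-in for a small-scale pre-training run on the given mixture.
    total = n_real + n_synth
    return {"total": total, "synth_frac": n_synth / total}

def heldout_loss(model: dict) -> float:
    # Toy loss: more data lowers loss, but a synthetic-heavy mixture
    # shifts the distribution and adds a drift penalty.
    size_gain = 0.1 * model["total"] ** 0.5
    drift_penalty = 2.0 * model["synth_frac"] ** 2
    return 2.0 - size_gain + drift_penalty

baseline = heldout_loss(train_small_model(n_real=100, n_synth=0))

results = {}
for n_synth in (25, 50, 100):  # sweep the amount of synthetic data mixed in
    results[n_synth] = heldout_loss(train_small_model(n_real=100, n_synth=n_synth))

# Decision rule: adopt a synthetic mix only if it beats the all-real baseline
# on held-out data by more than run-to-run noise (noise omitted in this toy).
adopted = {k: v for k, v in results.items() if v < baseline - 0.01}
```

In this toy, a modest synthetic fraction clears the bar while heavier fractions do not, which is exactly the failure mode the controlled ablation is meant to catch: apparent gains that evaporate, or reverse, as the distribution drifts.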
Even so, a core question remains unresolved: can a model trained on synthetic data surpass its "teacher"?
It is worth noting that Google's model training integrated data from multiple sources from the very beginning, which also laid the foundation for Gemini 3's multimodal advantages.
Borgeaud also revealed that DeepMind is pushing "post-Transformer architectures" and is bullish on "native models." Such models are expensive to develop, but the long-term value justifies the investment. As for this year's wave of large-scale reinforcement learning, the team has ample pre-training experience it can reuse, creating a technological synergy.
In the second half of the podcast, Borgeaud turned to the hotspots of the next round of pre-training. Pre-training, he believes, will no longer follow the single path of "bigger, longer, and more expensive"; the focus will shift to architectural innovation:
Long context and the attention mechanism are the key variables. The longer the context, the more information the model can carry at inference time, and the wider the boundary of its capabilities.
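A back-of-envelope calculation shows why context length is such a hard variable: standard self-attention compute grows quadratically with sequence length while the KV cache grows linearly. The model configuration below (layer count, head count, head dimension) is an arbitrary hypothetical, not Gemini's, and the FLOP formula is the usual rough approximation.

```python
# Rough cost model for dense self-attention at different context lengths.
# All configuration numbers are hypothetical, for illustration only.

def attention_costs(n_ctx: int, n_layers: int = 48, n_heads: int = 32,
                    d_head: int = 128, bytes_per_val: int = 2) -> tuple[float, float]:
    d_model = n_heads * d_head
    # QK^T scores plus attention-weighted V: ~4 * n^2 * d_model FLOPs per layer.
    attn_flops = n_layers * 4 * n_ctx**2 * d_model
    # KV cache: two tensors (K and V) of shape [n_ctx, d_model] per layer.
    kv_bytes = n_layers * 2 * n_ctx * d_model * bytes_per_val
    return attn_flops, kv_bytes

for n in (8_192, 131_072, 1_048_576):
    flops, kv = attention_costs(n)
    print(f"ctx={n:>9,}: attention ~{flops:.2e} FLOPs, KV cache ~{kv / 2**30:.1f} GiB")
```

Going from 8K to 128K context multiplies the KV cache by 16 but the attention FLOPs by 256, which is why long-context work leans so heavily on attention-mechanism innovation rather than brute force.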
The longer-term direction is to weave retrieval and search more deeply into training itself, learning end to end and differentiably, so that "being able to retrieve" becomes an innate capability of the model rather than an external tool bolted on after launch. He predicts that the scaling of reinforcement learning may accelerate this, but it will take several years to settle into a stable architecture and training paradigm.
Another main line is continual learning. Borgeaud said bluntly that once a base model's pre-training is complete, its knowledge is essentially frozen; if a new paper or discovery appears tomorrow, the model will not update itself. The more practical approach today lives mostly on the product's inference side: hook up retrieval, pull the latest information into the context in real time, and reason over that material, avoiding frequent retraining of the base model and easing knowledge staleness.
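That inference-side pattern (retrieve fresh documents, stuff them into the context, then reason) can be sketched in a few lines. The word-overlap scorer below is a deliberately crude stand-in; real systems use embedding similarity, and the function names and sample documents here are invented for the sketch.

```python
# Minimal sketch of inference-side retrieval: rank fresh documents against the
# query and prepend the best matches to the prompt, so a frozen base model can
# answer from material newer than its training cutoff.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy relevance score: number of shared lowercase words with the query.
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"[doc] {d}" for d in retrieve(query, docs))
    return f"{context}\n\nQuestion: {query}\nAnswer using the documents above."

fresh_news = [
    "New paper released today reports a stronger retrieval architecture.",
    "Stock markets closed higher on Tuesday.",
    "A new benchmark for long-context reasoning was published this week.",
]
prompt = build_prompt("what new paper was released today", fresh_news)
```

The base model never changes; only the context does, which is exactly why this sidesteps retraining but cannot, by itself, deliver the "continuous update" Borgeaud describes as the longer-term goal.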
This is consistent with the RETRO project he worked on, which stores knowledge in an external library while the model handles the reasoning. Retrieval augmentation, he believes, has only matured in recent years and should penetrate deeper into top-tier models like Gemini over the next few. The more distant goal is to change the training method itself, so the model can train continuously on a stream of real-world data and achieve "continuous updating" in the true sense.
Borgeaud also singled out evaluation as the core challenge of the pre-training stage. "If the evaluation system fails to keep up, it is easy to fall into the illusory churn of 'apparent improvement,' unable to tell whether the model genuinely improved or the data has a problem." That is why Google has built a dedicated evaluation system internally; external benchmarks are easily contaminated, so maintaining an internal evaluation stronghold is crucial.
Evaluation, he argues, must cross two gaps: first, whether improvements verified on small models transfer smoothly to large-scale ones; second, whether advantages gained in pre-training translate into real, usable capabilities after post-training.
Finally, serving cost is an unavoidable real-world constraint. As the user base keeps growing, the inference budget becomes ever more sensitive; pre-training must also answer for deployment, improving capability while cutting costs and conserving resources.
On Gemini 3's current performance, Borgeaud said flatly that it is "beyond expectations." The model, he believes, is genuinely getting smarter, a progress reflected not only in chart-topping benchmark results but in the experience of using it in real work.
Looking ahead, he predicts Gemini will serve scientific research better, perhaps even contributing to Nobel-worthy discoveries, while weaving itself ever more deeply into ordinary people's lives to solve practical problems.
"There is no end to the pace of progress. At least in the next year, this accelerating momentum will not slow down." This is his prediction for the future.
More details on Gemini 3's training, along with Borgeaud's insights, are shared in the podcast. We have translated the conversation and lightly edited and condensed it, without changing its meaning, for readers below.
The "Secret Recipe" Behind Gemini 3's Strength: Better Pre-training and Post-training
Matt Turck: I'd like to start with a tweet from Oriol Vinyals. Oriol is the VP of research and deep learning at Google DeepMind and co-lead of Gemini. When Gemini 3 was released, he said the secret behind the model is very simple: better pre-training and better post-training. Given the magnitude of Gemini 3's leap over the previous state of the art, that sounds quite plain. What's your take? Is it, in a sense, really that simple?
Sebastian Borgeaud: I'm not sure it can be called a secret; at least from my perspective, it's quite normal. People sometimes expect one major change between Gemini versions to make a huge difference. In my experience, there may indeed be one or two things that contribute more to the improvement, but overall it is the accumulation of many changes, and the work of a very large team, that makes Gemini 3 much better than previous generations. I think this will become a recurring theme: releases like Gemini 3 are the result of a large team's joint effort.
Matt Turck: What does this mean for AI progress? From the outside, it seems that just turning a few "knobs" has achieved a leap. What does this mean for the future? What can we expect next?
Sebastian Borgeaud: There are two points. First, it's still amazing that we can still make so much progress in this way, and the progress has not slowed down. There are many "knobs" and many improvements. We can almost find something to make the model better every day. Second, we are no longer building a single model but a system. People sometimes think that we are just training a neural network architecture, but in fact, we are also building the entire system around the network.
Matt Turck: What everyone cares most about is: what does this mean for real progress in intelligence? We don't have to go deep into "AGI," but how should we read the model's progress: is it a path toward intelligence, or just better performance on particular benchmarks? What makes you believe the core model is getting smarter?
Sebastian Borgeaud: Performance on benchmarks keeps improving, and the leading benchmarks are designed to be harder and harder. Even for someone like me with a computer-science background, some of the questions the model can answer would take me quite a long time. That's the benchmark view. We evaluate frequently and are very careful about holding out test sets. People often worry about overfitting to benchmarks, so-called benchmaxing; I don't think those concerns are well founded.
What gives me more confidence is that the amount of time people inside the company spend using the model to boost their productivity keeps growing. Each new generation can clearly do new things and is more helpful in research and daily engineering work than the last. That, too, shows the model is becoming more capable and doing genuinely useful things.
Matt Turck: If we take a step back, are you still surprised by the current situation? From your perspective, are we ahead, on schedule, or behind compared to what you expected a few years ago?
Sebastian Borgeaud: It's easy to say "on schedule" in hindsight. If I'm honest with myself, I think we are ahead of where I originally thought we could be. When I started working on large language models in 2019 or 2020, it was hard to believe the scale of what we are doing now or the capabilities of today's models. The scaling laws did point in this direction at the time, and some people really believed in them, but I'm not sure I would have bet heavily on reaching the current state.
A follow - up question is: If we can maintain the same kind of progress in the next few years as in the past five years, where will it take us? I think some really cool things will happen in the next few years.
Matt Turck: Where do you think it will go in the short term, two to three years? Will AI make new scientific discoveries and win the Nobel Prize?
Sebastian Borgeaud: This is part of it. In the scientific field, DeepMind has done a lot of work historically, and a large amount of work continues to move in this direction. I think there will be some major scientific discoveries in the next few years.
On the other hand, in my daily research and engineering work, I'm also looking forward to how we can use these models to drive more progress, while better understanding the systems we are building and further developing our own understanding and research.
Matt Turck: There is an important theme in the industry: Automating AI research and engineering. If we extrapolate, it will lead to a scenario like "AI 2027" with some kind of breakpoint. From a practical perspective, what's your experience of using AI in your work today? What will it mean in a few years?
Sebastian Borgeaud: I think it's more about making us faster rather than automating. It allows us to devote more time to the higher - level research part. In the daily work of language model research, we have to deal with very complex and large - scale systems at the infrastructure level. So, a considerable amount of time is spent on running experiments, monitoring experiments, analyzing data, and collecting results. The really interesting part is forming hypotheses and designing new experiments. I think the latter two parts will still be mainly done by us. The first part, especially in the next year, will be able to accelerate our work more and more as more agentic workflows are enabled.
Matt Turck: Do you think that basically all leading AI labs are doing the same thing in the same direction? There are new models almost every week or month, and we've been "spoiled". When Gemini 3 was just released, almost two hours before we recorded this podcast, GPT 5.2 was also released. What's your view? What will the future be like? Will someone stand out?
Sebastian Borgeaud: The work in different labs does have similarities, and the underlying technologies are similar; I wouldn't be surprised if everyone is training similar Transformer-like architectures. But on top of that there is real specialization: different branches of the research tree are explored and exploited by different companies. DeepMind, for example, has always been strong in vision and multimodality, and still is, which shows up in usage and benchmark performance. In reasoning, OpenAI released the first model, but we have relevant research too. So there are similarities, but they are not exactly the same.
As for whether someone will pull ahead, I'm not sure. One thing is clear: to keep making progress on a model like Gemini today genuinely requires a very large team and substantial resources. But that doesn't mean the current way is optimal; disruptive research may emerge that lets smaller teams gain an edge. That is also one reason I like working at Google: it has a history of conducting more exploratory research with a wide scope, much of it running in parallel with Gemini, and we can pull some of that progress into Gemini.
Matt Turck: Are there teams at DeepMind or elsewhere in the industry quietly, or entirely in secret, working on "post-Transformer" architectures? Could there be a surprise result one day?