Tang Jie, Yang Zhilin, Lin Junyang, and Yao Shunyu made a rare joint appearance. The "Big Four" of fundamental models discussed China's AGI.
Text | Zhou Xinyu, Deng Yongyi
Compilation | Wang Xinyi, Wei Ning
Editor | Su Jianxun
On January 10, 2026, at the AGI-Next Frontier Summit initiated by the Beijing Key Laboratory of Fundamental Models at Tsinghua University and Zhipu AI, the four most important players in China's current large-model field gathered on a rare occasion:
Zhipu successfully listed on the Hong Kong Stock Exchange on January 8 and was also the host of this closed-door meeting. Tang Jie, its founder and chief scientist, was the first guest to share on-site.
Yao Shunyu, who recently officially announced his move to Tencent, made his first public appearance since joining. After Tencent's recent, crucial restructuring of its model teams, this former OpenAI researcher took up the post of Chief Scientist in the CEO's Office.
Sharing the big-company camp with Yao Shunyu is Lin Junyang, technical lead of Alibaba's Qwen and the youngest P10 in Alibaba's history. Behind him, Alibaba's Tongyi Laboratory currently ranks first in the world in the number of derivatives and downloads of its open-source models.
Another important force at the closed-door meeting was the "Six Little Tigers," which have been at the center of public opinion recently. Yang Zhilin, CEO of Dark Side of the Moon (Moonshot AI), recently announced a new $500 million financing round.
If we were to identify the greatest consensus on AI in China, and indeed the world, in 2025, one thing would surely make the list: the capability of fundamental models determines the outcome of many upcoming competitions, such as who becomes the next super entry point or the next great company...
The four main characters at the closed-door meeting represent companies at different stages and with different business models, yet their actions since 2025 share a common theme: consolidate a position in the first echelon of fundamental models and drive business development with those models.
A year ago, marked by the sudden emergence of DeepSeek, China's large models won international recognition through rapid iteration and sustained open-sourcing.
However, at the meeting, Tang Jie poured a bit of cold water on Chinese developers: "The gap between large models in the US and China may not have narrowed, because the US still has a large number of closed-source models that have not been open-sourced."
Where exactly should the next-generation route to AGI lead? The guests defined the next stage of the AGI paradigm differently, and those definitions determined the differences in their exploration routes.
In Tang Jie's view, after the emergence of DeepSeek, the exploration of the Chat paradigm has basically ended. For the model-training paradigm of the post-DeepSeek era, he described Zhipu as having "bet" on Coding and Reasoning. GLM-4.5, which integrates reasoning, agentic, and coding capabilities, is a successful outcome of that bet.
For Yang Zhilin, a "believer in the Scaling Law," scaling remains the focus of the next stage. What has changed is that scaling no longer simply means adding computing power, but making technological improvements at the architecture, optimizer, and data levels. The goal is to give the model a better "Taste": "Taste is something we firmly believe in. The intelligence of the model will produce many different Tastes, which prevents the models from converging."
In the next stage, autonomous learning for AI is a direction all four guests are optimistic about.
However, there is a consensus among them that, as the exploration paradigm of AGI changes, establishing a new standard for measuring model intelligence will be very important.
Yang Zhilin defined the intelligence level of AI as a combination of Token Efficiency and Long Context: "This means how much of an advantage your model has at different context lengths."
Tang Jie had a similar view. He observed that the returns from today's frenzied RL and scaling are much lower than before. He therefore defined a new yardstick for measuring intelligence levels, Intelligence Efficiency, which measures the ROI between model input and intelligence gain.
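To make these two yardsticks concrete, here is one possible formalization; the notation is ours, not the speakers':
\[
\text{TokenEff}(L) \;=\; \frac{\text{task score achieved at context length } L}{\text{tokens consumed at context length } L},
\qquad
\text{IntelEff} \;=\; \frac{\Delta\,\text{intelligence (e.g., benchmark gain)}}{\Delta\,\text{input (e.g., extra training FLOPs or RL steps)}}.
\]
On Yang's definition, a model's overall intelligence would then track how well \(\text{TokenEff}(L)\) holds up as \(L\) grows; on Tang's, a paradigm is exhausted when \(\text{IntelEff}\) falls below the cost of the additional input.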
In fact, what drives the different AI exploration paradigms is, more than anything, a choice of goal: pursue the model's peak intelligence, or focus on deployment. That choice determines the model-training strategy: vertical integration or differentiated training.
On this issue, the two representatives from big companies, Lin Junyang and Yao Shunyu, reached a consensus: In the future, the differentiation between ToC and ToB will become more and more obvious, and the essence of AGI is to serve real human scenarios.
Yao Shunyu believes that in the ToC scenario, vertical integration is feasible. Whether it's Doubao or ChatGPT, the model and the product must be strongly coupled and iterated together to provide a good user experience. In the ToB scenario, it's the opposite: model companies focus on strengthening the model, while application companies seek to use the strongest models to improve productivity, resulting in a layered relationship.
Lin Junyang is more inclined to believe that this differentiation occurs naturally. "Companies don't have such distinct genetic differences. Both ToB and ToC are about serving real humans." He mentioned that Anthropic didn't succeed because of its excellent coding but because of its frequent communication with enterprise customers, which led to the discovery of real needs. Today, coding accounts for an absolute majority of API consumption in the US.
The following is a compilation of the AGI-Next round-table discussion, edited by "Intelligent Emergence":
Li Guangmi: Shunyu, could you elaborate on your thoughts on the topic of model differentiation?
Silicon Valley is experiencing differentiation, and Chinese models are also being open-sourced. For example, Anthropic focused on coding, while Google Gemini didn't try to cover everything but instead excelled at full modality. Your former employer (OpenAI) focused on ToC. With your cross-cultural experience in both the US and China, how do you see this?
Yao Shunyu: I have two main observations. One is the growing differentiation between ToC and ToB; the other is the differentiation between vertical integration and model-application layering.
Let me start with the differentiation between ToC and ToB. When we think of AI super - apps, there are currently two: ChatGPT and Claude, which can be regarded as examples of ToC and ToB respectively. Interestingly, for most people, the experience of using ChatGPT today is not as different from last year as one might expect.
In contrast, the coding revolution hadn't even started a year ago. In the past year, to put it somewhat hyperbolically, Claude has been reshaping how the entire computer industry operates: people no longer write code but communicate with computers in English.
The core issue is that most ToC users don't actually need such a high level of intelligence most of the time. Even if the model's ability to write abstract algebra has improved, most people won't notice; they mainly use it as an enhanced search engine.
However, in the ToB scenario, higher intelligence means higher productivity and more profit.
Another obvious point is that many players in the ToB market are willing to pay a premium for the strongest models. For example, if one model costs $200 per month and the second-best costs $50 per month, many Americans will pay the premium because it improves their work efficiency. A very powerful model like OpenAI 4.5 can get eight or nine out of ten tasks right, while a weaker model may only get five or six right. And then there's the additional cost of monitoring, because you don't know which five or six tasks it will get right.
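A rough back-of-the-envelope illustration of this premium logic, using the numbers above plus an assumed per-task value (our assumption, not Yao's):
\[
\text{net}_{\$200} = 9v - 200, \qquad \text{net}_{\$50} = 6v - 50 - m,
\]
where \(v\) is the monthly value of each task done correctly and \(m\) is the monitoring overhead of finding the failures. The stronger model wins whenever \(3v + m > 150\); at, say, \(v = \$100\) of productivity per task, the premium pays for itself even before counting \(m\).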
Therefore, I've noticed a very interesting phenomenon: In the ToB market, the differentiation between strong and weak models will become more and more obvious.
The second observation is the differentiation between vertical integration and model-application layering. In the past, people thought that having vertical-integration capabilities would lead to better results, but that's not necessarily the case today. The model layer and the application layer require different capabilities. For ToB productivity scenarios, larger pre-trained models are crucial, which is difficult for product companies to achieve.
Conversely, to make good use of a great model or to leverage the model's overflow capabilities, a lot of work needs to be done on the application and environment sides.
We can see that in ToC applications, vertical integration holds. Whether it's ChatGPT or Doubao, the model and the product are strongly coupled and closely iterated.
However, for ToB, the trend seems to be the opposite. Model companies focus on making the model stronger and stronger, while the application layer aims to use the best models to empower different productivity segments.
Li Guangmi: You've recently taken on a new role. In the Chinese market, what are your ideal bets? Are there any distinct features or keywords you can share?
Yao Shunyu: Tencent has a stronger ToC gene. We're thinking about how to use large models to provide more value to users. We've found that in many cases, the bottleneck in ToC is not larger models or stronger reinforcement learning but additional context and environment.
I often give an example. If you ask the model "What should I eat today?", whether you asked ChatGPT last year or this year, the answer might be poor.
To improve the answer, what's needed is not a stronger model or a better search engine but more input. If the model knows that it's cold today and you want something warm to eat, or that your wife is in another city and what she wants to eat... with this context, the quality of the answer is completely different.
For example, we can forward WeChat chat records to Yuanbao to give the model more useful input, which will bring a lot of additional value to users.
As for ToB, it's really a difficult task in China. Many companies doing Coding Agents actually target overseas markets. In this regard, we'll think about how to serve ourselves better first.
The difference between large companies and startups doing coding is that large companies already have all kinds of application scenarios and productivity-improvement needs. If our model performs better in these internal scenarios, the model gains unique advantages and the company develops better; more importantly, we can capture more diverse scenario data from the real world.
Companies like Anthropic and OpenAI are startups. They need to find data vendors to label data, but the number of people data vendors can hire and the scenarios they can think of are always limited, resulting in limited diversity.
However, if you're a company with 100,000 employees, you can make many interesting attempts to truly use real-world data instead of relying solely on labelers or distillation.
Li Guangmi: Junyang, what do you think Qwen's future niche in the ecosystem will be?
Lin Junyang: Companies may not have such distinct genetic differences; genes can be reshaped by successive generations of people. For example, now that Shunyu has joined Tencent, Tencent may become a company with Shunyu's imprint (laughs).
Today, both ToB and ToC are about serving real humans. So the essence of this question is: How can we make the human world a better place? Even ToC products will further differentiate, for example, becoming more focused on medical or legal fields.
I'm inclined to believe that Anthropic does better not because of its excellent coding but because of its extensive communication with the B-side. I've talked with many API vendors in the US, and they were all surprised by how much token consumption comes from coding. In China, token consumption in coding is not that high yet.
Today, Anthropic is more focused on finance-related areas, which is an opportunity they discovered through communication with customers.
So the differentiation among companies may be a natural process. I'm more inclined to believe in AGI and let things take their natural course.
Li Guangmi: What's your view on the differentiation issue, Professor Yang Qiang?
Yang Qiang: Historically, the academic community has been an onlooker while industry has led the charge. As a result, many academics are now also involved in industry-related work.
This is a good thing. At the beginning of astrophysics, it was mainly about observation, and then theories emerged. When a large number of large models reach a stable state, the academic community should catch up.
The academic community needs to address issues industry hasn't had time to solve, such as the upper limit of intelligence. Given a certain amount of resources, how well can we perform? More specifically, how should resources be allocated? How much should go to training and how much to inference?
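One standard way to frame the training-versus-inference split Professor Yang raises, under the common approximations that training a dense Transformer with \(N\) parameters on \(D\) tokens costs about \(6ND\) FLOPs and serving one token costs about \(2N\) FLOPs (a sketch of the framing, not something stated at the panel):
\[
C_{\text{total}} \;\approx\; \underbrace{6ND}_{\text{training}} \;+\; \underbrace{2NQ}_{\text{inference}},
\]
where \(Q\) is the number of tokens the model will serve over its lifetime. For a fixed budget \(C_{\text{total}}\) and a target quality, a model expected to serve many tokens (\(Q \gg D\)) should be made smaller and trained longer than the training-compute-optimal point, since every parameter saved is repaid on every served token.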
In the early 1990s, I conducted a small experiment. If we invest a certain amount in memory, to what extent can memory assist reasoning? Will this assistance turn negative? Will too much memory become noise? Is there an equilibrium point? These methodological issues are still relevant today.
I've also been thinking about another problem recently. There's an important theorem in computer science, Gödel's incompleteness theorem, which roughly means that a system (here, a large model) cannot prove its own consistency and must have some unavoidable hallucinations.
So the question arises: how many resources are needed to reduce hallucinations, or the error rate? There's an equilibrium point in between. This equilibrium is very similar to the balance between risk and return in economics, also known as the "No-Free-Lunch Theorem."
These issues are particularly suitable for joint research between the academic and industrial communities.
Just now, Professor Tang Jie also mentioned continuous learning, which involves the concept of time. How can we ensure that the learning ability of large models doesn't decline during continuous learning?
Humans have a way: sleep. I recommend reading the book "Why We Sleep." It argues that sleeping at night is actually about clearing noise, which lets us learn more accurately the next day and avoids the accumulation of errors.
These theoretical studies are giving birth to new computing models. Today we may be focused on Transformer- and agent-based computing, but new territory needs to be explored, and industry and academia need to work together.
Li Guangmi: Today, Zhipu seems to be following Anthropic's path, with a very strong coding ability. What's your view on the topic of differentiation, Professor Tang Jie?
Tang Jie: In 2023, we were the first to develop a Chat system. So our first thought was to quickly launch Chat. However, when it was launched in August or September 2023, a dozen large models were launched simultaneously, and each had relatively few users.
Of course, the situation has become even more differentiated today. After a year of reflection, the reason is that Chat doesn't really solve problems. In our original prediction, Chat would replace search. Today, I believe many people are starting to use models to replace search, but they haven't replaced Google. Instead, Google has revolutionized its own search.
In this regard, the battle of Chat has ended since the emergence of DeepSeek. We should think about what our next bet should be. At the beginning of 2025, our team debated for a long time and decided to bet on Coding. Then we dedicated all our efforts to coding.
Li Guangmi: Betting is a very interesting thing. My feeling is that over the past year, China has not only been strong in open-sourcing but has also made its own bets, and there may be further differentiation ahead, because people are not only pursuing general capabilities but also leveraging their own resource advantages to excel in their areas of strength.
Today, three years into the era of pre-training, RL has become a consensus, and Silicon Valley is discussing the next new paradigm: autonomous learning.
Shunyu worked at OpenAI, which promoted the Transformer and RL paradigms. How do you think about the next paradigm?
Yao Shunyu: Currently, autonomous learning is a very popular topic. People in cafes all over Silicon Valley are talking about it, and it has become a consensus