
Zhipu Founder Tang Jie Trades Replies with Elon Musk: It Won't Take Until 2027 to Surpass Claude Fable 5

AI Frontline · 2026-06-20 13:00
Recalibrating the Time Gap in Large Model Capabilities Between China and the US

On the evening of June 18th, Elon Musk and Tang Jie, the founder and chief scientist of Zhipu, exchanged replies on X.

It started with a user's question about when the gap between Chinese and American large models can be closed: "When do you think China will reach the level of Fable? GLM-5.2 will definitely narrow the gap."

Teortaxes, a technology enthusiast and minor influencer in tech circles (65K followers, known for consistently accurate technical readings of DeepSeek), then responded.

He first placed Zhipu's GLM-5.2 at roughly the level of Claude Opus 4.7 to 4.8. (He counts visual understanding separately, because Zhipu does not yet have a unified multi-modal model. He also thinks Opus is quite weak in this area.)

From this, it can be inferred that Chinese models currently trail American ones by 7 months.

He then used the Mythos timeline as a reference point: the Mythos series reached the Preview level in early February 2026, with capabilities that matched or exceeded Opus 4.8. Extrapolating from the pace of catching up with Claude, a Chinese model "with capabilities comparable to the full version of Mythos" would probably arrive between November and December 2026.

Elon Musk then joined the discussion. He thought it might take one quarter longer, "possibly in Q1 of 2027".

Then the person behind GLM-5.2 showed up. Professor Tang Jie answered Musk's estimate with a casual "It won't take that long", a modest way of showing off. The implied message: domestic large models, Zhipu's in particular, are expected to make the leap within this year.

With representatives of industry and academia from the Chinese and American AI communities discussing today's most closely watched question, the gap between Chinese and American models, the comment section heated up quickly.

Onlookers split into two camps. The excited camp believes GLM has achieved remarkable results at its current scale and that GLM-6, expected at the end of the year, is worth looking forward to. For example, GLM-5.1 could not even place in the global Harvey legal agent test (it reportedly scored zero), while GLM-5.2 now ranks in the top three, which shows how quickly Zhipu's models are iterating.

The calmer camp points out that the current GLM-5.2 does not even have cross-conversation memory, so catching up on benchmarks alone does not mean much.

Musk agreed with this. He said the gap between China and the United States on benchmarks might close before the end of the year, but that measured by real-world usefulness, catching up even by Q1 would be quite remarkable.

He believes that Anthropic has always focused on maximizing "useful intelligence". This won't be reflected in the benchmark scores, but it will definitely be reflected in the revenue.

This comparison once again positions Zhipu as the "Chinese version of Anthropic".

On the one hand, both companies have strong academic, research-oriented founding DNA, and both emphasize fundamental innovation and long-term value as they pursue the technological frontier.

On the other hand, in terms of commercialization pace and market recognition, Anthropic broke through in the B2B coding market and won mindshare among professional users, which produced a steep growth curve and a stable business model. In this respect it closely resembles Zhipu, which also performs well in B2B business.

Last week, Anthropic launched its latest flagship model, Claude Fable 5, but was widely criticized for quietly degrading the model and for restricting access by region. In contrast, Zhipu promptly launched GLM-5.2, which topped all open-weight models with 51 points, well ahead of MiniMax-M3 (44 points), DeepSeek V4 Pro (44 points), and Kimi K2.6 (43 points), and released it fully open source under the MIT license.

The contrast in the two companies' attitudes toward the open-source community and users earned Zhipu not only praise and popularity but also a soaring stock price: the cumulative gain over the past five trading days reached 99.81%, almost doubling.

In fact, Professor Tang Jie had already previewed a more significant Zhipu model update last month: native multi-modal.

At the beginning of May, he said it would launch within a few months. After he reiterated that timeline, Zhipu's global head also reposted the tweet and announced that "big things are about to happen", which suggests that a major GLM version upgrade is approaching.

Since the release of GLM-5, Zhipu has focused on coding and long-running agent tasks, and its open-source ecosystem ranks among the best in the world. In multi-modal, however, and especially native multi-modal, it still owes the outside world a clearer answer.

How important is this answer? Kimi's K2.5, released at the end of January this year, already has a native multi-modal architecture; Alibaba's Qwen3.5-Omni, launched in March, was pre-trained end to end on more than 100 million hours of audio and video data; and GPT-4o had implemented a native multi-modal architecture as early as April last year.

Understanding and building multi-modal capability has become the most crucial dimension along which leading models widen the gap. In his tweet, Tang Jie explained the strategic significance of building multi-modal capabilities: perceiving the environment is the basis for completing long tasks. Multi-modal is not just an added feature, but a prerequisite for agents to work in practice.

Filling the multi-modal gap is therefore not only necessary to support Zhipu's next capital narrative, but also the path it must take to close the loop on its technical roadmap.

However, the author believes that to catch up with Fable 5, domestic models need more than scaling pre-training to trillions of parameters. The greater challenge lies in post-training: enabling the model to (partially) run its own training and iteration, that is, recursive self-improvement (RSI).

Finally, let's return to the evergreen topic of "the gap between Chinese and American models". Dario Amodei, the CEO of Anthropic, has offered his own "end-game judgment".

In the May report "2028: Two scenarios for global AI leadership", he laid out two scenarios: in one, the United States and its allies maintain their lead; in the other, China catches up with the United States.

The report as a whole, of course, calls on the United States to lock in a lead of 12 to 24 months by closing loopholes such as chip smuggling, access to overseas data centers, and distillation attacks.

This means that in January 2028, China's best model will at most catch up with the level of the US model in January 2027. In other words, there is a gap of at least one year.
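The date arithmetic behind this reading can be checked with a short sketch. It assumes, purely for illustration, that a "lead" of N months means China's best model at a reference date matches the US frontier as of N months earlier; the helper name `shift_months` is hypothetical and not taken from the report.

```python
from datetime import date

def shift_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` away from d's month."""
    # Count months since year 0 so that the year rolls over correctly.
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

# Reference point used in the article: January 2028.
reference = date(2028, 1, 1)

# The report's target range for the US lead, in months (assumed lower and upper bounds).
for lead in (12, 24):
    matched = shift_months(reference, -lead)
    print(f"lead of {lead} months: China in {reference:%Y-%m} matches US level of {matched:%Y-%m}")
```

Under this assumption, the 12-month lower bound gives the January 2027 figure quoted above, and the 24-month upper bound would put the matching US level at January 2026.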

However, it now seems that unless Dario's company makes rapid progress this year, the generation gap between China and the United States is likely to narrow further.

Disclaimer: This article is an original by AI Frontline. It does not represent the views of the platform and does not constitute investment advice. Reproduction is prohibited without permission.

This article is from the WeChat official account "AI Frontline" (ID: ai-front). The author is Qing He. It is published by 36Kr with authorization.