Xiaomi's Luo Fuli: Fable 5 Is Only a Phased Achievement

Leading AI experts from Xiaomi, ShengShu Technology, Mianbi Intelligence, and Nanyang Technological University meet for a face-to-face dialogue.

Zhidx reported on June 12 that another significant dialogue has just taken place in AI circles. At the 8th Beijing Academy of Artificial Intelligence (BAAI) Conference, four AI experts from Xiaomi, ShengShu Technology, Mianbi Intelligence, and Nanyang Technological University shared the stage for an in-depth dialogue of nearly 50 minutes, covering topics from the recently popular Claude Fable 5 to AI self-evolution and AI Coding.

The dialogue "Reconstructing the World - The Peak Dialogue of Chinese Large Models" was hosted by Wang Zhongyuan, the dean of BAAI. The four guests were Luo Fuli, the person in charge of Xiaomi Group's MiMo; Zhu Jun, a professor at the Department of Computer Science of Tsinghua University and the founder of ShengShu Technology; Liu Zhiyuan, the co-founder and chief scientist of Mianbi Intelligence and a professor at Tsinghua University; and An Bo, a chair professor at Nanyang Technological University and the dean of the Institute of Artificial Intelligence Cross-Research. They talked freely about current AI hot topics and closed with advice for young people.

▲ The session of "Reconstructing the World - The Peak Dialogue of Chinese Large Models" at the BAAI Conference

Regarding Claude Fable 5, which has recently sparked intense discussion in the industry, Luo Fuli believes that it is in essence still the natural result of continuous scaling: a phased achievement following continued expansion along three dimensions, namely pre-training scale, data scale, and reinforcement learning. Her sense of AI today is that large models and AGI are developing so fast that even practitioners like her are shocked.

The Token economy remains a hot topic. Zhu Jun noted that using Agents or AI Coding to solve problems used to consume large numbers of Tokens, and that the new model's reduced Token consumption on tasks is the right direction.

Liu Zhiyuan explained why he thinks Anthropic has succeeded, to the point that its valuation has surpassed OpenAI's: in his view, Anthropic found the very important direction of code. Going forward, he said, forming closed data loops in professional fields will definitely accelerate AI applications. He also said that the intelligence revolution is really about using AI to replace people's mechanical, repetitive mental work, and that the core driving force behind AI creating AI is still humans.

Regarding AI self-evolution, An Bo believes that while AI capabilities are still weak, an important prerequisite for self-evolution to work is that the environment cannot be completely closed. A data flywheel built in a completely closed environment is unlikely to succeed.

Finally, the four experts gave advice to young people:

Luo Fuli: Keep your desire to explore and your curiosity, and use the latest large models as deeply as possible; Liu Zhiyuan: Be the first to act, persist over the long run, and keep reinventing yourself; Zhu Jun: Actively embrace this era and actively use AI; An Bo: Choosing the right track and working on important things matters a great deal.

The following is the transcript of this dialogue. Zhidx has edited it for readability without changing the meaning:

01. The latest concerns of the four experts: Self-evolution, world models, intelligence density, and Harness

Wang Zhongyuan: Good morning, everyone. Friends who have been following the BAAI Conference will have noticed that the opening ceremonies of the past two years have included a closely watched roundtable session. The year before last, we discussed "The Road to AGI", and last year it was "The Embodied Living Room". For these roundtables, we invite representative experts and scholars from the industry to discuss the most cutting-edge issues in the AI field together.

This year, the theme of the roundtable dialogue is "Reconstructing the World". Why "Reconstructing the World"? Because we are standing at a new historical turning point. Artificial intelligence is no longer just a tool for transforming a particular industry; it is becoming the underlying force that reconstructs the world. AI Coding, autonomous agents, and model self-evolution are opening up the possibility of AI creating AI.

World models, embodied intelligence, and robots are extending intelligence from the digital world to the physical world. In the future, the key competition will be over who can first master the ability to create intelligence, control intelligence, and let intelligence reshape reality. That is why we have named this roundtable dialogue "Reconstructing the World".

Before we formally begin, I'd like to ask our four guests to briefly introduce themselves and share the one or two technical issues they have been most concerned with recently.

Luo Fuli: Hello, everyone. I'm Luo Fuli, the person in charge of Xiaomi's MiMo large-model team. AI is developing on a grand scale right now, and it is hard to sum it up in a single word. Recently, the direction I have been most concerned with is Self Improvement, especially in the Auto Research field.

Zhu Jun: Hello, everyone. I'm Zhu Jun from Tsinghua University. I'm currently also working on general world models. Recently, I have been most concerned with model architectures that take video as their native form, and with how to use such a model to enter the physical world by connecting the abilities to understand, predict, and act in the world.

Liu Zhiyuan: Hello, everyone. I'm Liu Zhiyuan, a professor at the Department of Computer Science of Tsinghua University and also the co-founder and chief scientist of Mianbi Intelligence. Recently, our focus has been the "law of intelligence density" of large models. We hope to keep raising the intelligence density of our models, making them more and more capable, and ultimately to empower all kinds of intelligent terminals.

An Bo: Hello, everyone. I'm An Bo from Nanyang Technological University, and I also hold some part-time roles in industry. Recently, we have been concerned with Agent Harness: given the capabilities of the base model, how can we obtain stronger reasoning abilities through a better Harness mechanism?

02. How to view Claude Fable 5? It is essentially the result of continuous scaling

Wang Zhongyuan: Just now, several of our guests mentioned that technology is still developing very fast. Let's start with Fable 5, which was released two days ago. Anthropic's newly released Fable 5 shows significant improvements in programming ability and Agent ability. In a case shared officially, migrating an entire codebase of 50 million lines of code would take a human team a month, while with Fable 5 it takes only one day.

I'd like to invite everyone to share their views on this model and the latest progress of AI Coding. Is the current development still an accumulation of quantitative changes, or is it approaching the critical point of qualitative change? At the same time, all of you are training models. Is the model's ability still accelerating? Let's start with Luo Fuli.

Luo Fuli: In my opinion, the capabilities shown by Fable 5 at present are still essentially the natural result of continuous scaling.

First, there is scaling in the pre-training stage. We speculate that the parameter scale of Fable 5 may be several times that of the strongest current open-source model. Second, a large amount of computing resources has also been invested in Test-Time Scaling and reinforcement learning. In addition, as the industry moves from the Chat era to the Agent era, the training data has changed as well. Model training is expanding from Internet text data to synthetic data jointly generated by humans and Agents, and the data scale has reached a new level. In the past, the scale of Unique Tokens in Internet text data was roughly between 40T and 80T; the data scale has now moved beyond that stage.

Therefore, I believe that Fable 5 is a phased achievement following continued expansion along three dimensions: pre-training scale, data scale, and reinforcement learning.

▲ Luo Fuli, the person in charge of Xiaomi Group's MiMo

Wang Zhongyuan: So you think it is still an intermediate-stage model?

Luo Fuli: Yes. At least from the several dimensions mentioned just now, none of them have stopped, and the relevant scaling paths are still being continuously promoted.

Wang Zhongyuan: Xiaomi's MiMo has also performed very well recently and ranks highly on OpenRouter. From what you have observed, is the improvement in large-model capabilities closer to linear growth or exponential growth?

Luo Fuli: It's hard for me to describe it with a fixed curve, because the improvement of model capabilities is often an emergent process. Whether on different scaling paths or at different stages, we see similar emergence phenomena. So it's hard to summarize it simply as linear or exponential growth.

Wang Zhongyuan: Teacher Zhu, please go ahead.

Zhu Jun: I haven't directly trained language models myself, so Luo Fuli may be more qualified to speak on this issue. However, judging from the feedback of teachers and students around me, people generally believe that Fable 5 is a significant improvement over the previous generation. Some even joked that they used to think of themselves as the mentors, but now feel that the model has become the mentor. From our own experience training video models and world models, performance still improves very significantly as the model scale and data scale continue to expand.

In the past two years, we have seen very obvious progress in physical law modeling, simulation, and world simulation. Initially, people often saw all kinds of hallucination problems, but now models can generate high-quality, professional-level content that is practically usable in many scenarios. This progress essentially comes from the same path: larger models, higher-quality data, and larger-scale training.

When the model further moves towards the physical world, a frequently discussed question is: Can the model really learn physical laws? My view is that as the capabilities of the basic model continue to improve, learning rigorous logic, physical laws, and 3D world understanding on this basis will become more efficient. In the future, many scenarios do not require extremely precise physical simulation, and a large number of tasks can be completed with intuitive understanding. This is the important value brought by large models.

Regarding Fable 5 itself, I need to use it more before I can give a more specific evaluation. But there is one point I very much agree with. In the past, when people used Agents or AI Coding to solve problems, they often consumed a large number of Tokens, while the new model significantly reduces Token consumption in enterprise tasks. I think this is very much the right direction. For many complex tasks, the model should rely on higher-level intelligence to call tools and organize its reasoning, rather than simply consuming more Tokens. This is an important direction for large models to keep delivering value in the future.

Wang Zhongyuan: Thank you, Teacher Zhu. I'd like to ask a follow-up question. We can see that the scaling paradigm still holds for large language models, and performance is still improving. Have video-generation models reached the limits of scaling? Or is it still possible to achieve better performance by adding more data and using larger models?

Zhu Jun: For video and world models, I think scaling is still under way, and the potential is very large.

Recently, everyone has been paying attention to Seedance's new model. Some of the information that has been shared shows that its architecture scales up more aggressively than previous models, and the results have been very significant. If we extend this to more general world models, I believe the Scale-Up path may still be very long. People are now talking about acquiring more physical data, using data more efficiently, and optimizing architectures. I think this has only just started, and there is still a lot to explore.

▲ Zhu Jun, a professor at the Department of Computer Science of Tsinghua University and the founder of ShengShu Technology

Liu Zhiyuan: I'd like to share three points of thinking.

The first point is that, as Fuli just said, this is an embodiment of sustainable scaling. The logic behind it is that a closed loop has been found in the form of a sustainable data flywheel.

Whether we look at the success of reinforcement learning in 2024 and 2025 or at Anthropic's Claude Code this time, the product can collect feedback from around the world, along with data generated as people work with code. This in effect provides a strong driving force for sustainable development, which is a very important inspiration.

The second point is that code is a very important productivity tool in the digital world. Clearly, the continued improvement of large code models will have a disruptive impact on every industry that depends on code, such as industrial software and vulnerability discovery.

This becomes very important once the data is relatively mature. I think we need to consider together the possibilities for innovation and exploration here. That is, for industrial software that used to be a bottleneck for us, could we rewrite it with large code models and form a new domestic ecosystem?

The third point, which I find more inspiring, is that large code models can form a closed loop so quickly because their data is generated entirely in the digital world, where closing the loop is very easy. We can then see that Anthropic succeeded because it found a very important vertical direction in code. In our world, human professional knowledge exists in many other fields as well.

If we can quickly form closed data loops in these professional fields, we will definitely accelerate the application of AI across industries. I think Anthropic's success with large code models, and even its current valuation being higher than OpenAI's, is an inspiration for us. We should look creatively for more possible closed data loops of different kinds. These are my three points.

Wang Zhongyuan: Thank you, Teacher Liu. So you think there are still opportunities in new fields, and that closed data loops for AI may create new value. Teacher An, what do you think of the Fable 5 model?

An Bo: In the past two days, we haven't trained any models. We've been working on Harness. We've tried different models, and the choice of model has a great impact on the final result.

The previous speakers have already shared a lot. Personally, I think the self-development trend has been very popular recently, whether with Codex or Claude Code. By obtaining more usage data and more feedback, a model's capabilities can be continuously enhanced. As Teacher Liu just said, Coding is very important. When we do reasoning, a model with strong Coding ability is very useful for problems that can be solved by Coding. Of course, not all problems can be solved that way. Many problems cannot be solved by writing code at all, and other methods may need to be found.

