The most low-key of the "Six Little Tigers" opens the year with a leap to the stars: Agents land on intelligent terminals, and Yin Qi shows up too | The Frontline
Written by Zhou Xinyu
Edited by Su Jianxun
At its Ecosystem Open Day on February 21, 2025, StepStar, one of the "AI Six Tigers", turned in its report card on models. It covers the company's exploration of the next stage of AGI and the form in which its models will be deployed, and it also implicitly reveals StepStar's attitude towards DeepSeek.
In 2024, StepStar was undoubtedly the "king of involution" in the model layer. Last year, the youngest of the six tigers released 11 models, covering language, speech, vision, and reasoning.
StepStar Model Matrix.
This rapid progress in the model layer follows from StepStar's route to AGI. StepStar CEO Jiang Daxin once told "Intelligent Emergence" that the company settled on its AGI path on the day it was founded: single modality, then multi-modality, then the unification of multi-modal understanding and generation, then world models, and finally AGI (Artificial General Intelligence).
By this measure, StepStar reached the multi-modality node in 2024. But before it can start on "the unification of multi-modal understanding and generation", the first thing this member of the six tigers has to face is the disruption caused by DeepSeek.
Since 2024, DeepSeek has acted as a "catfish" that has stirred up the large-model race. On the one hand, the DeepSeek API set off the model price war, bringing the price of large models down to 1 yuan per million tokens; on the other hand, the open-source reasoning model DeepSeek R1 has made the industry begin to rethink the brute-force aesthetics of the Scaling Law.
Many practitioners believe that DeepSeek has hit the six tigers hard. The open-sourcing of high-performance models such as R1 makes closed-source models harder to commercialize. And the low-cost reinforcement learning training paradigm adopted by R1 has raised more doubts about the six tigers' high valuations and heavy cash burn.
How to respond to DeepSeek has become the most pressing question for the six tigers. Several model companies are adjusting their strategies by de-emphasizing the model API business and turning to consumer (To C) products. For example, MiniMax has downsized its To B team and integrated DeepSeek R1 into its own AI assistant platform.
StepStar's prompt response was to open source its own models.
StepStar did not challenge DeepSeek directly, but the two models it open sourced right after the release of R1 are seen as a silent move to defend its technical standing. It is worth noting that both are multi-modal models, unlike DeepSeek's releases, which focus on text.
StepStar Open Sources Two Multi-Modal Models.
One of the open-source models is the 30-billion-parameter text-to-video model Step-Video-T2V, which currently has the largest parameter count of any model of its kind in the world. The other is the 130-billion-parameter speech interaction model Step-Audio.
Jiang Daxin revealed at the Open Day that StepStar also plans to open source an image-to-video model in March 2025.
As for the next stage of its AGI exploration, multi-modal reasoning is the direction of model development that StepStar is betting on.
Many in the industry share this view. For example, "Intelligent Emergence" exclusively reported that Shen Dou, Executive Vice President of Baidu Group and President of Baidu Intelligent Cloud Business Group, expects the industry's focus to shift from training to reasoning, with multi-modality becoming the mainstream demand.
The shift from generation to reasoning in multi-modality means that multi-modal models will not only generate pictures and videos, but also understand what is in them.
StepStar's Progress in Multi-Modal Reasoning Models.
At the Open Day, StepStar announced Open-Reasoner-Zero, developed with Tsinghua University. It is described as the first open-source reasoning model trained with large-scale reinforcement learning directly on a pre-trained model, with an efficiency 25 times that of DeepSeek-R1-Zero.
Jiang Daxin also revealed an internal project now under way: a visual reasoning model. He said the model can perform slow thinking in the visual space. Shown a route map, for example, it can answer a question such as "Following the arrows, what is the final destination?"
As for putting models into practice, the direction StepStar is betting on is AI Agents.
Why is 2025 the breakout year for Agents? In Jiang Daxin's view, the two key factors behind Agents made significant progress in 2024: multi-modality and slow thinking (reasoning with long chains of thought, which can solve complex problems).
StepStar's approach to Agents is to deploy them directly. Jiang Daxin divides Agents into two categories: vertical Agents and intelligent terminal Agents.
StepStar Agent Cooperation Ecology.
In both directions, StepStar has chosen to co-create with downstream customers. In the vertical field, StepStar and Cai Lian She developed the financial and economic information assistant "AI Little God of Wealth"; in the intelligent terminal field, StepStar's Agent has been integrated into products from terminal manufacturers such as Geely Automobile, OPPO, and Robosen Robotics.
The financial and economic information assistant "AI Little God of Wealth" jointly developed by StepStar and Cai Lian She.
It is worth mentioning that Yin Qi, who founded Megvii Technology, one of the "AI Four Little Dragons", appeared at the Open Day's roundtable forum in a new role: he is now the chairman of the autonomous driving company "Thousand Miles Technology". He believes that the most successful AI products so far are still Tesla and Douyin, but that large models will open up a larger market for applications.
Yin Qi, the founder of Megvii Technology and the chairman of Thousand Miles Technology, participates in the StepStar Roundtable Forum.