Four-hour interview between Yan Junjie of MiniMax and Luo Yonghao: Finding a third way for China's AI. The mountains are not insurmountable.
While the entire AI circle frets over DAU (daily active users) and funding rounds, Yan Junjie, the founder of MiniMax, shows an almost cold-blooded indifference.
Sitting across from Luo Yonghao, Yan Junjie doesn't seem like a tech upstart in charge of an AI unicorn enterprise.
He refuses to talk about changing the world and instead admits his fear. That fear comes not from business competition but from the technology itself: when a model's capabilities start to surpass those of humans, its creators are the first to feel uneasy.
As long as something can be quantified, the model will definitely be stronger than humans or at least reach the level of the best humans. All relatively successful models make their creators a bit scared before they are developed.
According to an interview by LatePost, within MiniMax, DAU, which is regarded as the golden rule in the Internet industry, is directly defined by Yan Junjie as a "vanity metric".
In 2025, surrounded by giants, short of computing power, and with hot money ebbing, MiniMax is correcting its own thinking. It no longer follows the mobile Internet playbook of buying growth with heavy spending and retaining users by piling up features. Instead it is returning to the essence: treating the model as the most important product.
In the era of large models, the real product is the model itself, and the traditional product is more like a channel. If the model is not smart enough, a well-made product is useless.
In this conversation between Luo Yonghao and Yan Junjie, I found that MiniMax chose, from its very first day, a technological path destined to run against the mainstream.
While everyone is looking for China's OpenAI and China's Sam Altman, Yan Junjie is trying to prove the value of "non-geniuses". The story of MiniMax is not about a genius's sudden inspiration but a precise experiment: how to pry open a narrow door to AGI, in the cracks left by limited resources, through extremely rational calculation and correction.
Reaching AGI with 1/50 of the chips
On the surface, MiniMax's technological path over the past three years looks like a series of isolated bets, but a single logical thread runs through it: with limited resources, how to approach the upper limit of AGI by optimizing more cleverly rather than by piling up more computing power.
When the industry was still focused on text, MiniMax made what was then an extremely risky decision: betting on full modality from the first day of its establishment. Yan Junjie later explained that they had been clear from the beginning that true AGI must have multimodal input and multimodal output.
When they started the company more than three years ago, there was no ready-made technological path. Their strategy was to get each modality working first and then integrate them when the time was right. This persistence was widely questioned at the time; the mainstream view in the industry was that a company should first focus on a single modality and perfect it.
But Yan Junjie's logic is that the essence of AGI is multimodal fusion. If the different modalities are not advanced in parallel now, the technical debt will become a fatal flaw when fusion is needed. This non-consensus persistence has given MiniMax full-modality capabilities that, in 2025, rank first globally in audio, second in video, and firmly in the first echelon in text.
Recently, OpenAI's Sora 2 achieved remarkable results through multimodal fusion, which to some extent confirms MiniMax's foresight in choosing this path at its founding.
Even more radical, Yan Junjie broke with the traditional model of AI research from the company's founding.
This was the first cognitive breakthrough when the company was just founded: to do large models well, one must not blindly trust previous experience but analyze the problem from first principles. About four or five years ago, people in artificial intelligence pursued writing a lot of mathematical formulas and making theories look good and fancy. But the core of this generation of artificial intelligence is actually scaling (the Scaling Law), which means using the simplest method to achieve better results, with results that keep improving as data and computing power increase.
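As a rough illustration of what "results keep improving as computing power increases" can look like, scaling behaviour is often summarized as a power law, loss = a * compute^(-b), which becomes a straight line on log-log axes. The Python sketch below fits that form by least squares. The power-law form, the function name fit_power_law, and every number here are assumptions chosen for illustration; none of it is data or code from MiniMax or Baidu.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b), done in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(loss) vs log(compute).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(compute), so a = exp(intercept), b = -slope.
    return math.exp(intercept), -slope


# Synthetic, hypothetical measurements generated from a known power law.
compute = [1e18, 1e19, 1e20, 1e21, 1e22]
loss = [50.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate one order of magnitude beyond the largest "measured" budget.
predicted = a * 1e23 ** -b
```

Because the synthetic points lie exactly on a power law, the fit recovers the exponent used to generate them, and the extrapolated loss at the larger budget is lower than any measured value. Real training curves are noisier and may flatten, so an extrapolation like this is only a planning aid.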
Yan Junjie's technological intuition comes from his internship experience at Baidu in 2014. At that time, Dario Amodei, the CEO of Anthropic, was also interning at Baidu, and it was there that he discovered the prototype of the Scaling Law.
Yan Junjie said that the Scaling Law was actually discovered in 2014 when doing speech recognition, but it was not widely recognized until around 2020. "It had existed six years ago, and it happened in a Chinese company, so what happened later is a bit regrettable."
This episode made Yan Junjie realize that China had not lacked opportunities; it had missed the chance to turn technological insight into industrial advantage.
The reality is cruel. Yan Junjie is very clear about the gap between China and the United States. He calculated that the valuation of the best US companies is 100 times that of Chinese startups, and their revenue is also basically 100 times, but the technology may only be 5% more advanced, and they spend about 50 to 100 times more money.
Then why can Chinese companies achieve similar results, with only a 5% gap, while spending 1/50 of the money? The core reason is that Chinese talent is still very good. More importantly, China has far less computing power than the United States, so it has to use more innovative methods to achieve the same results. The principles may be the same, but there are many innovations in the methods used in each module.
Limited computing power may not be a curse but can become a whip to force innovation.
This explains why MiniMax started exploring the MoE architecture first in 2023, why it dared to bet on the linear attention mechanism in 2025, and why it returned to the full attention mechanism in the M2 model.
Every technological choice is an attempt to balance the triangle of quality, speed, and price under limited resources.
If DeepSeek's logic is to "squeeze every bit of computing power through extreme engineering optimization", then MiniMax is prying greater possibilities out of limited resources through algorithmic breakthroughs and new mechanisms.
One is steady and cautious, and the other takes an unconventional approach.
One notable innovation is "Interleaved Thinking", which MiniMax proposed for the model's inference mechanism. It lets the model advance a task in a cycle of "do things, stop and think, do things again".
The new mechanism quickly won adaptation and support from mainstream overseas platforms and frameworks such as OpenRouter and Ollama, and it also pushed domestic models such as Kimi and DeepSeek to gradually add similar capabilities.
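The "do things, stop and think, do things again" cycle can be pictured as a simple loop that alternates a reasoning step with an action step and feeds each result back into the next round of reasoning. The Python sketch below is a toy under stated assumptions: toy_think, toy_act, and run_interleaved are hypothetical stand-ins invented here, and they do not represent MiniMax's implementation or any real API.

```python
from dataclasses import dataclass


@dataclass
class Step:
    kind: str      # "think" or "act"
    content: str


def toy_think(goal, observations):
    """Hypothetical reasoning step: pick the next unfinished sub-task, or stop."""
    remaining = [task for task in goal if task not in observations]
    if not remaining:
        return "all sub-tasks done", None
    return f"next: {remaining[0]}", remaining[0]


def toy_act(action):
    """Hypothetical tool call: returns an observation for the chosen action."""
    return f"result of {action}"


def run_interleaved(goal, max_steps=10):
    """Alternate thinking and acting until the reasoning step decides to stop."""
    trace = []
    observations = {}
    for _ in range(max_steps):
        # Think: re-read everything observed so far before choosing an action.
        thought, action = toy_think(goal, observations)
        trace.append(Step("think", thought))
        if action is None:
            break
        # Act: run the chosen action and record what came back.
        observations[action] = toy_act(action)
        trace.append(Step("act", observations[action]))
    return trace


trace = run_interleaved(["search docs", "write code", "run tests"])
kinds = [step.kind for step in trace]
```

The point of the pattern is that each reasoning step sees the results of earlier actions, instead of a single plan being fixed up front and then executed blindly.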
But behind these achievements, what is more worth asking is: how did a team without returnees from Silicon Valley, regarded as "grassroots" by the outside world, develop a globally leading model?
Yan Junjie's answer is unexpected.
AI is not a mystery but an engineering problem that can be broken down from first principles: how to design the algorithm, how to build the data pipeline, and how to optimize training efficiency. Each part has a very clear goal.
Based on this judgment, Yan Junjie gave up the search for "geniuses" and instead came to believe that a scientific methodology can let ordinary people play extraordinary roles. He also mentioned that there are quite a few returnees in the company, but many of the colleagues who really play a key role are basically in their first jobs.
On the wall of the MiniMax meeting room, there is a line of words: Intelligence with Everyone. This is the original intention of Yan Junjie's entrepreneurship and also the reason why many people choose to join MiniMax.
This line of words is becoming a reality today. Users from more than 200 countries and regions around the world are using MiniMax's multimodal model. Among them, there are 212 million individual users, and more than 100,000 enterprises and developers are creating more products and services.
The AI helmsman of non-geniusism
If the non-consensus on the technological path is the visible part, then Yan Junjie's own growth trajectory is a practice of "anti-fragility".
Yan Junjie comes from a small county in Henan. He developed an extremely strong ability to teach himself in an environment where resources were extremely scarce.
In primary school he read a lot of books, many of them not meant for children his age. For example, he read high-school and even university books while still in primary school. His father taught junior high school, so he started reading junior-high-school materials. In junior high school he moved on to high-school materials, and in high school he started learning calculus. No one taught him these things; he simply read by himself.
Teaching himself junior-high-school material in primary school and calculus in high school: this habit of learning ahead, unrestricted by his environment, has run through Yan Junjie's entire entrepreneurial career. When others were waiting for guidance from their advisors, he had already analyzed the problems himself from first principles; when others were complaining about the lack of resources, he had already made up the gap through sheer self-study.
But the ability to teach oneself doesn't mean everything goes smoothly, as the "cruel training" he received at SenseTime showed. There, wanting to build the best product, he chose to work on face recognition. It took about a year and a half to go from the bottom to the top.
That year and a half was very painful. He always ranked among the last few in each technical test. This kind of suffering was enough to break most people. But Yan Junjie didn't give up. Instead, he extracted a core methodology from the experience: one must make choices, and choose something that can bring fundamental, long-term change rather than minor repairs.
After this experience, the most important thing is to have confidence in one's own most fundamental judgments.
This experience forged two key traits in Yan Junjie. One is an extreme ability to make trade-offs, giving up short-term fixes to focus on long-term breakthroughs. The other is high psychological resilience, the ability to withstand prolonged failure and doubt.
These two traits are precisely what allow MiniMax to keep its "Buddha-like" composure in sticking to a non-consensus technological path, and what allow Yan Junjie to remain calm through difficulties such as the Silicon Valley Bank crisis and model training failures.
The third way for Chinese AI
As the story of MiniMax unfolds, a bigger question naturally emerges: when cultivating talent takes time and catching up technologically takes a full cycle, what can Chinese AI companies rely on right now to carve out their own room to survive?
MiniMax may not be the standard answer, but Yan Junjie has three principles that he has always adhered to since the start of his business:
First, focus on users rather than projects; second, develop both the domestic and overseas markets simultaneously.
In 2022, while large domestic companies were still waiting to see if it was worth investing in AI, startups generally chose the ToB path (doing projects and selling solutions) to achieve quick cash flow. But Yan Junjie chose the most difficult path: ToC, and targeted the global market from the very first day.
Yan Junjie therefore chose to polish his technology in the more intense competition overseas rather than join the battle for traffic with giants in the domestic market. Events have shown this was the right choice: MiniMax's DAU and payment rate in overseas markets have stayed in a healthy range, and this is becoming its moat.
But the most difficult one is the third principle: technology-driven versus user growth.
This is the ultimate test for all AI startups. Yan Junjie admitted that he was torn too, but he finally chose the former, even though it meant sacrificing short-term metrics, losing mid-level employees, and facing doubts from outside.
Driving product and business development through model capabilities, or through the growth methods of the mobile Internet era, may both be correct, but they cannot coexist. In the end, we found that the technology-driven approach is more suitable for us.
Under the technology-driven strategy, Yan Junjie made another key choice: open source.
Shortly after DeepSeek R1 emerged at the beginning of the year, Yan Junjie said that if he could choose again, he would have gone open source from the first day. He brought up open source again in the conversation with Luo Yonghao. The same thing has happened in mobile operating systems: Apple's system is closed source, while Android is open source. Those in second place or further behind must open-source in order to have their own unique positioning and develop a new ecosystem.
In order to make progress, we need to give others a reason to choose us. The openness of the model is exactly such a reason, because it builds strong technical trust, lets people understand our R&D ability, and makes them willing to cooperate more deeply.
And MiniMax continues the open-source wave set off by DeepSeek. After the release of MiniMax M2, the large-model analysis platform Artificial Analysis described it like this:
Chinese AI laboratories continue to maintain a leading position in the open-source field. MiniMax's release continues the leading position of Chinese AI in the open-source field, which was initiated by DeepSeek at the end of 2024 and maintained by subsequent releases of DeepSeek, Alibaba, Zhipu, and Kimi.
Recently, the global model aggregation platform OpenRouter jointly released a report, "State of AI's 100 Trillion Tokens", with a16z. It shows that after the M2 model was open-sourced, it was quickly welcomed and adopted by developers worldwide.
The share of global usage accounted for by Chinese open-source models has soared from 1.2% at the beginning of 2024 to 30% now. The focus of the global open-source ecosystem has shifted to China.
But this competition is far from over. Yan Junjie's judgment is that the physical limitations of computing power and chips determine that there is a ceiling for the model's parameter quantity and cost. With a limited number of parameters, different people will make different choices, and there will definitely be different results.
AI will not be monopol