DeepSeek V4 has finally been released, but the five open-ended questions it leaves behind still have no answers.
Text | Zhou Xinyu
Data compilation | Zhong Chudi
Editors | Su Jianxun, Yang Xuan
The other shoe has finally dropped.
DeepSeek V4, jokingly dubbed "coming next week" for nearly three months, has finally shown its true form.
A maximum parameter count of 1.6T, a 1M-token context window, performance optimization for Agents, and the use of MoE (Mixture of Experts) plus the DSA sparse attention mechanism to cut compute and memory requirements: the specifications long speculated about by outsiders have now been confirmed with V4's official release.
Performance evaluation results of DeepSeek V4.
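To make the spec sheet above a bit more concrete: a mixture-of-experts layer routes each token to only a few expert sub-networks, which is why a model with 1.6T total parameters does not pay 1.6T parameters' worth of compute per token. The sketch below is purely illustrative; the class name TinyMoE, the layer sizes, and the top-2 routing are our own assumptions for demonstration, not DeepSeek's implementation, and it does not attempt to reproduce the DSA sparse attention mechanism.

```python
# Illustrative top-k mixture-of-experts layer (toy sizes, not DeepSeek's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)      # scores each token against each expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                # x: (num_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)         # routing probabilities per token
        weights, idx = gate.topk(self.top_k, dim=-1)     # keep only the top-k experts per token
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TinyMoE()(tokens).shape)  # torch.Size([16, 64]); only 2 of 8 expert MLPs ran per token
```

The point of the sketch is only the activation pattern: per token, just two of the eight expert MLPs run, so compute and memory grow far more slowly than the total parameter count.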
The late arrival is tied to the migration of V4's training framework from NVIDIA to Huawei Ascend, as well as to internal decision-making changes at DeepSeek. We learned that in mid-2025, DeepSeek suffered a fairly serious training failure.
"At that time, DeepSeek was facing the problem of re - adapting to the chips," an insider mentioned. "There were also disagreements within the company regarding the training direction. Liang Wenfeng put forward some of his own requirements, but it was difficult to reach a compromise at the implementation level."
However, contrary to outside speculation that the new model would support multimodal generation and understanding, V4 remains a language model. The decision to postpone the training strategy for multimodal generation comes down mainly to constraints on computing power and cash.
Several insiders told Intelligent Emergence that DeepSeek opened an external financing window in mid-April 2026. The internal trigger: DeepSeek needed more funds to train models at larger parameter scales and to retain and recruit more top-tier talent.
"Compared with the models of top - tier manufacturers such as OpenAI and Anthropic, the 1.6T parameter count does not have an absolute competitive edge," a practitioner told us. Soon, domestic model manufacturers will also release models with a parameter scale of 3T.
On the talent front, after core contributors such as Guo Daya (a core author of DeepSeek R1) and Wang Bingxuan (a core author of DeepSeek LLM) were poached by large companies like ByteDance and Tencent, DeepSeek needed a large round of financing to stabilize the team and recruit new people.
As for the external trigger for opening up financing, several industry insiders speculated that it was related to the investment stance of a certain large company. Before the financing window opened, Liang Wenfeng and that company's top executive held several discussions about an exclusive investment; two sources said Liang Wenfeng would not agree to the condition of giving up 20% of the shares.
Since the release of R1, one transformation has been obvious: DeepSeek has been forced to shift quickly from a non-profit-minded, idealistic technological utopia into a pragmatic company that values products and commercialization.
On April 8, 2026, the DeepSeek App was redesigned, adding an "Expert Mode" for complex reasoning and a "Fast Mode" for simple tasks. With V4's release, we also learned that the 1.6T-parameter V4-pro powers "Expert Mode", while the 284B-parameter V4-flash supports "Fast Mode".
Two modes of the DeepSeek App.
An insider said that since the second half of 2025, Liang Wenfeng has paid more attention to product refinement. Several AI product managers at large companies told Intelligent Emergence that at the end of 2025, DeepSeek ran an open recruitment drive for product strategists and managers, and that DeepSeek's HR had contacted them several times.
An industry insider also told Intelligent Emergence that DeepSeek has set up several internal product innovation teams to explore Agents and other consumer-facing (C-end) product forms.
Judging from this update, DeepSeek's writing ability has improved markedly. Over the past year, we have also heard from many AI-industry HR staff and headhunters that DeepSeek's recruiters have more than once visited the dormitories of Peking University's Chinese Department to add students on WeChat.
The purpose of recruiting Chinese-literature majors is to annotate data in the humanities and build evaluation standards, which is read as a signal that DeepSeek cares about the model's humanistic quality.
Although "inclusiveness" and "openness", with a simple product featuring only a chat interface, are the images that DeepSeek presents to the outside world. However, we learned that in 2025, DeepSeek never stopped exploring products and commercialization. Currently, an internal product team of dozens of people has been established to explore product forms such as Agents.
Even earlier, back in 2024 before DeepSeek became popular, the company had considered buying traffic for user acquisition, but Liang Wenfeng quickly rejected the idea.
DeepSeek's annual update has finally landed, like the sword of Damocles falling at last, slightly easing the anxiety of model makers in China and beyond.
Entering 2026, DeepSeek's annual iteration became the AI world's "boy who cried wolf" story, and dodging DeepSeek became a standard move for model makers in recent months.
Two newly listed large-model makers, Zhipu and MiniMax, released their new models GLM 5 and M 2.5 before the Spring Festival to avoid a head-on clash.
An employee of Zhipu told Intelligent Emergence that as soon as the rumor that "DeepSeek will release a model during the Spring Festival" spread, the algorithm team immediately held a meeting and demanded that GLM 5 be released "as early as possible".
An employee of MiniMax likewise said that in mid-January, before the afterglow of the Hong Kong IPO celebration had faded, the algorithm team voluntarily came back to work early.
"Avoiding the peak" is particularly important for these two model startup companies that have gone public. "If we release the model later than DeepSeek and the performance is inferior, it will affect the stock price. But if we don't release it, it will also affect the stock price," the above - mentioned employee said. "The way to minimize the impact is to release it earlier."
Model companies' financing moves also need to land before DeepSeek's update.
Jieyue Xingchen (StepFun), which announced its Series B+ round at the end of January, was likewise eager to close the round before the Spring Festival. An insider told us that once DeepSeek makes a new move, the cost of communicating with investors rises sharply.
In practitioners' eyes, there have always been "two DeepSeeks" at the table: one brings the fear of being crushed, the other serves as a paradigm leader. In the two years when model makers were moving slowly, the industry needed such an "uncertainty factor" to force vendors to reflect and then sprint.
An employee of MiniMax recalled that in the internal letter and all-hands meeting after the Spring Festival, founder and CEO Yan Junjie said: "DeepSeek has helped us find a path I want to take."
Even though Chinese AI practitioners hold complicated feelings toward DeepSeek, people still concede that it has rewritten many rules of the Chinese AI industry.
Change often means destruction and reconstruction, which is never a comfortable experience. But as an investor in the "Six Little Tigers" put it to us: over the past year DeepSeek set the organizational culture and R&D focus of Chinese large models; from here, "it is the starting point for Chinese AI to become a global first-class player, not the end."
DeepSeek has pushed the competitive landscape of Chinese AI into a relatively stable mid-game. But at this early stage of model technology, what DeepSeek has left the industry is more than consensus. As commercialization and competitive pressure intensify, model makers are heading down different forks on questions of open source, commercialization, and growth.
Before the release of DeepSeek V4, we spoke with more than a dozen AI industry insiders around one topic: what has DeepSeek changed in the Chinese AI industry?
Here are the five new propositions of the "post-DeepSeek era" that we distilled.
Proposition 1: Re-evaluate the cost-effectiveness of open source
A year ago, after DeepSeek published the R1 technical report, one AI investor's judgment was that the most important thing for model makers was to return to base-model research and build a technical brand through open source and openness.
But now, he told us, that judgment needs re-examination.
After a year of following DeepSeek's lead, is the era in which vendors go all-in on the open-source and research ecosystem coming to an end? The question was put squarely on the table by the recent departure of Lin Junyang, technical lead of Alibaba's Qianwen (Qwen) large model.
In a sense, the Qwen that Lin Junyang led represents the interests of the open-source ecosystem, and that now stands in sharp contradiction with Alibaba's need, as a commercial company, to turn a profit.
"The golden age of non-profit is over," a Qwen employee said of the episode.
What shakes the vendors is the fact that the two highest-revenue model companies today both take the closed-source route: OpenAI's annualized revenue exceeds $25 billion and Anthropic's exceeds $19 billion (per The Information, as of the end of February 2026).
As for domestic vendors' model revenue, recently disclosed 2025 financial reports show MiniMax's annual revenue at $79.038 million and Zhipu's at 724 million yuan (about $105 million), still more than two orders of magnitude behind OpenAI and Anthropic.
△Annualized revenue of OpenAI and Anthropic since 2023. Source: The Information
At the AGI Next Conference in January 2026, Zhipu founder Tang Jie also issued a warning: "We may just be having fun in the 'open-source playground', while the closed-source models in the United States have already entered the next era."
There is no doubt that the open-source, open ecosystem driven by DeepSeek allowed Chinese models to quickly win global recognition and technical reputation in 2025.
But the brutal fact is that the stage of cold-starting quickly and building technical reputation through open source has passed. Now, while base-model R&D still burns cash, converting reputation into real money is the more pressing survival question.
It's time to re-evaluate the value of open source.
Proposition 2: The user-acquisition war pauses, and refined user acquisition begins
How should one read DeepSeek's feat of "zero ad spend, over 100 million users within 7 days of the App's launch"?
A year ago, the industry's attention inevitably fixed on "zero ad spend". That breakthrough narrative overturned the growth playbook many vendors had firmly believed in and punctured the false prosperity that model products had manufactured at the time.
Alarm bells rang and a stress response followed: at the start of 2025, many companies' reflections were as radical as their earlier large-scale ad spending had been.
A typical example is Yuezhianmian (Moonshot AI), which had kicked off the user-acquisition war in the first place.
Intelligent Emergence reported that at a six-hour strategy meeting in February 2025, Moonshot co-founder Zhang Yutong announced the immediate suspension of Kimi's ad buying on Android channels and cut the iOS ad budget from tens of millions of yuan per day to tens of thousands of yuan per day.
A middle - level manager of the "Six Little Tigers" once hypothesized for us: With Kimi and Doubao as the main players, the radical investment promotion war in AI applications will probably last until Q2 of 2025. According to an average investment promotion expenditure of $200 million per quarter, Yuezhianmian will be the first to lose due to financial pressure.
As the stress response gives way to rationality, most growth staff at vendors told us that ad spending should continue, but as smart, targeted growth.
In fact, the aggressive ad and subsidy war never stopped because of DeepSeek's atypical success; it is just that the main participants are now a handful of large companies with deep pockets and traffic entry points.
The fiercest scene of the growth war came during the 2026 Spring Festival: Alibaba's Qianwen spent 3 billion yuan treating users to milk tea, Tencent's Yuanbao handed out 1 billion yuan in red envelopes, and ByteDance spent 1 billion yuan putting Doubao on the Spring Festival Gala stage.
A member of the growth team of the "Six Little Tigers" described the current investment promotion as "a clever woman trying to cook without rice": "The traffic entrances are in the hands of large companies, which means that the remaining model manufacturers need to adopt more refined growth methods, give up building overall awareness, and focus on target users."
He gave an example. If the main scenarios of an AI product are in finance and legal office work, then promoting the product on some financial Apps will be more cost - effective.
Proposition 3: Returning to the base model: practicality or research?
After R1 took off, focusing on base-model R&D suddenly became the consensus among AI model makers.
"We are more determined about our research direction," a former researcher from Yuezhianmian who witnessed the release of R1 told us. "R1 is not an earth - shattering innovation, but it proves that as long as the general direction is correct and manufacturers stick to their own routes, they can get positive feedback in terms of performance, just like DeepSeek has always adhered to pure language and reasoning."
Previously, to climb leaderboards or chase hot topics, many vendors would separately train models optimized for different capabilities such as reasoning or dialogue.
"That can optimize a single ability, but the model's practicality suffers, and customers won't necessarily pay for it," a Zhipu employee told us. He mentioned that what alarmed Zhipu was that after R1's release, many leading industry customers switched to deploying DeepSeek.
At that time, Zhipu made a "difficult but correct" decision in the eyes of the above - mentioned employee: to train a model, GLM 4.5, that simultaneously focuses on reasoning, coding, and agentic abilities.
"This is Zhipu's first 'anti - list' model, and the performance optimization direction comes from real customer needs," he