
DeepSeek V4 is finally here, but the five open questions it leaves behind remain unanswered.

阿菜cabbage · 2026-04-24 13:20
DeepSeek marks the starting point of Chinese AI entering the global first tier, but it will not be the end.

Text by | Zhou Xinyu

Data compilation by | Zhong Chudi

Edited by | Su Jianxun, Yang Xuan

The other shoe has finally dropped.

DeepSeek V4, jokingly nicknamed "Next Week" through nearly three months of rumored imminent release, has finally shown its true face.

A maximum parameter count of 1.6T, a 1M-token context window, performance optimizations for Agent workloads, and the use of MoE (Mixture of Experts) together with the sparse attention mechanism DSA to cut compute and memory requirements: the specifications the outside world had long speculated about were confirmed with the official announcement of V4.
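For readers unfamiliar with the MoE architecture mentioned above, the general idea can be sketched in a few lines. This is an illustrative toy with invented dimensions, not DeepSeek's implementation (whose internals are not described in this article): a gating network scores all experts, but only the top-k actually run for each token.

```python
# Toy sketch of top-k MoE (Mixture of Experts) routing.
# Sizes (d, n_experts, k) are invented for illustration only.
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token to its top-k experts and mix their outputs.

    x:       (d,) token activation
    gate_w:  (n_experts, d) gating weights
    experts: list of callables, each mapping (d,) -> (d,)
    """
    logits = gate_w @ x                 # one gating score per expert
    top = np.argsort(logits)[-k:]       # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()            # softmax over the selected experts
    # Only the selected experts execute; this is why MoE reduces compute
    # relative to a dense model with the same total parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

d, n_experts = 8, 4
rng = np.random.default_rng(0)
gate_w = rng.normal(size=(n_experts, d))
# Each "expert" here is just a random linear map, captured at creation time.
experts = [lambda v, W=rng.normal(size=(d, d)): W @ v for _ in range(n_experts)]
out = moe_forward(rng.normal(size=d), gate_w, experts)
print(out.shape)  # (8,)
```

The point of the sketch is the ratio: total parameters scale with `n_experts`, but per-token compute scales only with `k`, which is how a 1.6T-parameter model can stay affordable to serve.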

△ Performance evaluation results for DeepSeek V4.

Its late arrival is tied to the migration of V4's training framework from NVIDIA to Huawei Ascend, as well as shifts in DeepSeek's internal decision-making. We learned that in mid-2025, DeepSeek suffered a fairly serious training failure.

"At the time, DeepSeek was grappling with re-adapting to the new chips," an insider said. "There were also internal disagreements over the training direction. Liang Wenfeng raised requirements of his own, but a compromise was hard to reach at the implementation level."

Contrary to outside speculation that the new model would support multimodal generation and understanding, however, V4 remains a language model. The decision to postpone a multimodal training strategy was driven mainly by constraints on compute and cash.

Several insiders told "Intelligent Emergence" that DeepSeek opened a window for external financing in mid-April 2026. The internal trigger: DeepSeek needed more funds to support training models at larger parameter scales and to retain and recruit more top-tier talent.

"Compared with models from top-tier players such as OpenAI and Anthropic, a 1.6T parameter count is not absolutely competitive," a practitioner told us. Domestic model makers will soon release models at the 3T parameter scale.

On the talent front, after core contributors such as Guo Daya (a core author of DeepSeek R1) and Wang Bingxuan (a core author of DeepSeek LLM) were poached by large companies like ByteDance and Tencent, DeepSeek needed a large financing round to steady employee morale and recruit new talent.

As for the external trigger for opening up to financing, several industry insiders speculated it was related to Tencent's investment posture. Before the financing began, Liang Wenfeng and Ma Huateng held several discussions about an exclusive investment. Two sources close to the matter, however, said Liang Wenfeng would not agree to giving Tencent a 20% stake.

Since the release of R1, one change has been obvious: DeepSeek has been forced to transform quickly from a non-profit, idealistic technological utopia into a pragmatic company that values products and commercialization.

On April 8, 2026, the DeepSeek App was redesigned, launching an "Expert Mode" for complex reasoning and a "Fast Mode" for simple tasks. With the release of V4, we also learned that the 1.6T-parameter V4-pro powers "Expert Mode", while the 284B-parameter V4-flash supports "Fast Mode".

△ The two modes of the DeepSeek App.

An insider said that since the second half of 2025, Liang Wenfeng has paid more attention to product polish. Several AI product managers at large companies told "Intelligent Emergence" that at the end of 2025, DeepSeek recruited product strategists and managers at scale, and each had been contacted multiple times by DeepSeek's HR.

An industry insider also revealed to "Intelligent Emergence" that DeepSeek has set up several internal product-innovation teams to explore Agents and other consumer-facing (C-end) product forms.

Judging from this update, DeepSeek's text-handling ability has improved markedly. Over the past year, we have also heard from many AI-industry HR staff and headhunters that DeepSeek recruiters have more than once visited the dormitories of Peking University's Chinese Department to add students on WeChat.

The purpose of recruiting Chinese-literature majors is to do data annotation in the humanities and to establish evaluation standards, a signal that DeepSeek takes the humanistic quality of its models seriously.

Although "inclusiveness" and "openness", embodied in a simple Chat interface, are the image DeepSeek presents to the outside world, we learned that throughout 2025 DeepSeek never stopped exploring products and commercialization: it has now built an internal product team of dozens of people to explore product forms such as Agents.

Even earlier, in 2024, before it became famous, DeepSeek considered paid user acquisition, but Liang Wenfeng quickly rejected the idea.

DeepSeek's long-awaited annual update has finally landed, and like the fall of the sword of Damocles, it has slightly eased the anxieties of model makers in China and abroad.

Since 2026, DeepSeek's rumored annual iteration has become the AI world's "boy who cried wolf". Steering clear of DeepSeek's release window has become standard practice for model makers in recent months.

Zhipu and MiniMax, two newly listed large-model makers, released their new models GLM 5 and M 2.5 before the Spring Festival to avoid the rush.

A Zhipu employee told "Intelligent Emergence" that as soon as the rumor spread that "DeepSeek will release a model during the Spring Festival", the algorithm team immediately called a meeting and demanded that GLM 5 be released "as early as possible".

A MiniMax employee likewise said that in mid-January, before the hangover from celebrating the Hong Kong IPO had faded, the algorithm team voluntarily returned to their desks early.

"Avoiding the rush" matters especially for these two listed model startups. "If we release later than DeepSeek and the performance falls short, the stock price suffers; if we don't release at all, the stock price also suffers," the employee quoted above said. "The way to minimize the impact is to release earlier."

Model companies also need to complete their financing moves before DeepSeek's update.

Jieyue Xingchen, which announced its Series B+ at the end of January, was likewise eager to close the round before the Spring Festival. An insider told us that once DeepSeek makes another surprise move, the cost of communicating with investors becomes very high.

In practitioners' eyes, there have always been "two DeepSeeks" at the table: one brings the fear of being eclipsed, the other leads as a paradigm. After two years of model makers settling into a slower rhythm, the industry needs such an "uncertainty factor" to force reflection, and then a sprint.

A MiniMax employee recalled that in the internal letter and all-hands meeting after the Spring Festival, founder and CEO Yan Junjie said: "DeepSeek has helped us find a path I wanted to take."

Even if Chinese AI practitioners have complicated feelings about DeepSeek, people still concede that it has rewritten many of the rules of the Chinese AI industry.

Change usually means tearing down and rebuilding, which is never comfortable. But as an investor in the "Six Little Tigers" put it to us: DeepSeek has shaped the organizational culture and R&D focus of Chinese large models over the past year. Beyond that, "it is the starting point of Chinese AI becoming a global first-tier player, but not the end."

DeepSeek has pushed the competitive landscape of the Chinese AI industry into a relatively stable middle stage. But in this early phase of model technology, DeepSeek has left the industry more than just consensus. As commercialization and competitive pressure mount, manufacturers are heading toward different forks on questions such as open source, commercialization, and growth.

Before the release of DeepSeek V4, we spoke with more than a dozen AI-industry insiders around one question: "What has DeepSeek changed in the Chinese AI industry?"

Below are the five propositions of the "post-DeepSeek era" that we distilled from those conversations.

Proposition 1: Re-evaluating the cost-effectiveness of open source

A year ago, after the technical report for DeepSeek R1 was released, one AI investor's judgment was that returning to base-model research, and building a technical brand through open source and openness, was the most important thing a model maker could do.

Now, however, he tells us that his earlier judgment is debatable.

A year into following DeepSeek, whether the era of manufacturers strongly backing the open-source and research ecosystem is drawing to a close has become a key question, especially after the recent departure of Lin Junyang, technical lead of Alibaba's Qianwen (Qwen) large model.

In a sense, the Qwen effort that Lin Junyang led represented the interests of the open-source ecosystem. It now sits in sharp tension with Alibaba's profitability as a commercial company.

"The golden age of non-profit is over," is how one Qwen employee summed up the episode.

What makes manufacturers waver is that the two highest-revenue model makers today both take the closed-source route: OpenAI's annualized revenue exceeds $25 billion, and Anthropic's exceeds $19 billion (per The Information, as of the end of February 2026).

As for domestic manufacturers' model revenue, recently disclosed 2025 financial reports show MiniMax's total annual revenue at $79.038 million and Zhipu's at 724 million yuan (about $105 million), still around two orders of magnitude behind OpenAI and Anthropic.

△ The annualized revenue of OpenAI and Anthropic since 2023. Source: The Information

At the AGI Next Conference in January 2026, Zhipu founder Tang Jie issued a warning: "We may just be amusing ourselves in the 'open-source playground' while America's closed-source models have already entered the next era."

There is no doubt that the open-source, open ecosystem that DeepSeek drove enabled Chinese models to rapidly build global recognition and a technical reputation in 2025.

The harsh fact, however, is that the stage of cheaply "cold-starting" a technical reputation through open source has passed. With base-model R&D still burning large amounts of money, how to convert that reputation into real revenue is now the more pressing survival question.

It is time to re-evaluate the value of open source.

Proposition 2: The ad-spending war pauses; refined user acquisition begins

How should one read DeepSeek's feat of "zero ad spend, over 100 million users within 7 days of the App's launch"?

A year ago, the industry's attention would have gravitated to "zero ad spend": this breakout narrative overturned growth paths many manufacturers firmly believed in and punctured the false prosperity that model products had conjured at the time.

Alarm, then a stress response. At the start of 2025, many companies' reflections were as radical as their earlier large-scale ad spending had been.

A typical example is Yuezhianmian, which had started the ad-spending war.

As "Intelligent Emergence" reported, at a strategy meeting in February 2025 that ran five or six hours, Yuezhianmian co-founder Zhang Yutong announced an immediate halt to Kimi's ad spending on Android channels and a cut of the iOS ad budget from tens of millions of yuan per day to tens of thousands.

A mid-level manager at one of the "Six Little Tigers" once hypothesized to us that the aggressive ad war among AI apps, chiefly Kimi and Doubao, would probably last until Q2 2025; at an average ad spend of $200 million per quarter, Yuezhianmian would be the first to buckle under the financial pressure.

As the stress response gives way to rationality, most growth-team members at these companies told us that ad spending must continue, but as a smart, targeted growth strategy.

In fact, the aggressive ad-and-subsidy war never stopped because of DeepSeek's atypical success. Its main participants now, however, are a few large companies with deep pockets and their own traffic entry points.

The fiercest scene of the growth war came during the 2026 Spring Festival: Alibaba's Qianwen spent 3 billion yuan treating users to milk tea, Tencent's Yuanbao handed out 1 billion yuan in red envelopes, and ByteDance spent 1 billion yuan promoting Doubao on the Spring Festival Gala stage.

A growth-team member at one of the "Six Little Tigers" described today's ad buying as "a clever cook with no rice": "The traffic entry points are in the hands of the big companies, so the remaining model makers need more refined growth methods, giving up on building broad awareness and focusing on target users."

He gave an example: if an AI product's main scenarios are finance and legal office work, then advertising it inside finance apps is more cost-effective.

Proposition 3: Returning to the base model: practicality or research?

After R1 took off, focusing on base-model R&D became the consensus among AI model makers overnight.

"We became more resolute about our research direction," a former Yuezhianmian researcher who witnessed R1's release told us. "R1 is not a groundbreaking innovation, but it proves that as long as a manufacturer judges the broad direction correctly and sticks to its own path, it can get positive feedback on performance, just as DeepSeek has always stuck with pure language and reasoning."

Previously, to climb leaderboards or chase hot topics, many manufacturers would separately train models specialized for different capabilities such as reasoning or dialogue.

"That can optimize one ability, but the model's practicality suffers, and customers won't necessarily buy it," a Zhipu employee told us. He mentioned a phenomenon that alarmed Zhipu: after R1's release, many leading customers across industries switched to deploying DeepSeek.

Zhipu then made a decision the employee called "difficult but correct": training a single model, GLM 4.5, that simultaneously targets reasoning, coding, and agentic abilities.

"This was Zhipu's first 'anti-leaderboard' model. The directions for performance optimization came from customers' real needs," he said. "<