DeepSeek bids farewell to the era of the "lone hero"
The release of DeepSeek V4 is confirmed; the financing, by contrast, still exists only as media reports and deal rumors.
DeepSeek's official website already lists the launch of "DeepSeek-V4 Preview", claiming stronger agent capabilities and top-tier reasoning, and the preview can already be used on the website, in the apps, and via the API.
DeepSeek launched the V4 preview on April 24 in Pro and Flash versions; it will replace V3, which was released in December 2024.
Regarding the financing, the official line so far remains "under negotiation". Tencent and Alibaba are reportedly in talks to invest in DeepSeek at a valuation above $20 billion; neither DeepSeek nor Tencent or Alibaba has immediately responded.
The financial magazine Caixin further reports that DeepSeek is negotiating a financing round with Tencent and Alibaba as investors. The two companies plan to invest a combined $1.8 billion at a valuation of roughly $20 billion, although the terms of the deal have not yet been finalized.
What does the release of V4 mean?
The last time DeepSeek truly changed the course of the industry was in the transition from V3 to R1, when it delivered several things at once: low cost, high performance, open-source weights, and a reasoning model.
After R1, the real test for DeepSeek was whether it could keep delivering next-generation products.
Many technology companies experience their peak only once. A successful model, intense media coverage, or a shock to the capital markets can bring a company into the spotlight. The hard part is to keep innovating after the publicity fades.
The release of V4 shows at least that DeepSeek has not stopped at R1. It has made a name for itself with low costs and continues to develop its flagship model.
Moreover, the focus of V4 goes one step further than just "affordability".
According to DeepSeek's official Hugging Face page, the V4 series includes two MoE models: V4-Pro with 1.6T total parameters and 49B active parameters, and V4-Flash with 284B total parameters and 13B active parameters. Both models support a context of 1 million tokens.
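The gap between total and active parameters is what makes MoE models cheap to serve relative to their headline size. A quick back-of-envelope check, reading the active-parameter counts in billions (consistent with V3's 37B activated parameters; the figures are from the reports cited here, not independently verified):

```python
# Share of parameters activated per token for the two reported V4 variants.
# Parameter counts are taken from reported specs; treat them as unverified.
def active_share(total: float, active: float) -> float:
    """Fraction of the parameter budget used for any single token."""
    return active / total

pro = active_share(total=1.6e12, active=49e9)    # V4-Pro
flash = active_share(total=284e9, active=13e9)   # V4-Flash
print(f"V4-Pro activates {pro:.1%} per token, V4-Flash {flash:.1%}")
```

At roughly 3–5% activation per token, compute per forward pass tracks the active count rather than the total, which is why a 1.6T-parameter model can still be served economically.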
This shows that DeepSeek is shifting the competitive focus from pure cost-efficiency to more complex tasks.
All these capabilities point in the same direction: the execution of more complex tasks. A long context lets the model process longer inputs and continuous tasks. The "Thinking Mode" maps to complex reasoning processes, while tool calls and JSON output are geared toward connecting external systems and stable use on the application side. This is also why V4 is discussed in the context of agents.
Looking more closely, the real value of V4 lies in DeepSeek's continued focus on "efficiency": it wants the model not only to read longer texts but to do so more cheaply.
The next stage of competition among large models will not be decided on leaderboards alone. Leaderboards answer the question "who is stronger"; in the real business world, the question is: who can provide these capabilities stably, cheaply, and at scale?
It is one thing for a model to be able to read 1 million tokens. Whether a large number of users, developers, and enterprise customers can use it at an affordable price is another.
If the long context remains a demo capability, it is more of a technical showcase. But if it can be integrated cost-effectively into APIs, enterprise applications, and agent workflows, it becomes an infrastructure capability.
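The economics here are easy to sketch. With an entirely hypothetical price per million input tokens (DeepSeek has published no V4 pricing), the cost of actually exercising a 1M-token context at scale becomes clear:

```python
# Illustrative only: the price below is a made-up placeholder,
# not a published DeepSeek rate.
price_per_mtok = 0.50        # USD per million input tokens (assumed)
context_tokens = 1_000_000   # one full-length context window per call
cost_per_call = context_tokens / 1e6 * price_per_mtok
daily_calls = 100_000
print(f"${cost_per_call:.2f} per call, "
      f"${cost_per_call * daily_calls:,.0f}/day")  # $0.50 per call, $50,000/day
```

Even at a modest assumed rate, full-context usage compounds quickly, which is why serving efficiency, not just model capability, decides whether long context becomes infrastructure.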
V4 therefore reflects DeepSeek's core strategy: not simply piling on parameters, but further improving system efficiency.
V4-Pro is the high-performance variant for complex reasoning, long contexts, and demanding tasks. V4-Flash is the volume variant, built for speed, cost, and high-frequency use cases.
This shows that DeepSeek takes into account the service forms for different users, scenarios, and cost structures.
Website, apps, and API are rolled out simultaneously; Pro and Flash are offered at the same time; long context, reasoning, and tool calls receive equal emphasis. DeepSeek is preparing to become a player that must handle real applications, real usage, and real commercial pressure.
Why is DeepSeek seeking financing?
More precisely, the old financing structure no longer fits the new competitive phase.
In the past, DeepSeek was special because it could draw on the capital, computing power, and engineering strength of the quant fund High-Flyer. It could therefore maintain a "non-VC narrative" for a long time: no hurry to raise money, commercialize, or tell growth stories.
There are reports that DeepSeek has rejected financing proposals from top Chinese VC firms and technology giants several times in the past.
But in the V4 phase, the situation has changed.
The costs for training and inference are increasing.
During the V3 era, what impressed the market most was that DeepSeek built a strong model at extremely low training cost. The V3 technical report states that DeepSeek-V3 is an MoE model with 671B total parameters and 37B activated parameters, pre-trained on 14.8T tokens, with full training taking only 2.788M H800 GPU hours.
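Those GPU hours translate directly into the headline training cost. The V3 technical report prices H800 rental at an assumed $2 per GPU hour, which yields the widely cited figure:

```python
# Training cost estimate reproduced from the DeepSeek-V3 technical report:
# 2.788M H800 GPU hours at the report's assumed $2-per-GPU-hour rental price.
gpu_hours = 2.788e6
price_per_gpu_hour = 2.0  # USD, assumption stated in the report
total_cost = gpu_hours * price_per_gpu_hour
print(f"${total_cost / 1e6:.3f}M")  # ≈ $5.576M
```

That roughly $5.6M figure is what set the "extremely low training cost" benchmark the market remembers.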
But V4 now has 1.6T total parameters, over 32T pre-training tokens, a one-million-token context, and agent capabilities.
At this scale, even a company as good at engineering optimization as DeepSeek cannot stay fully independent of capital expenditure. Inference in particular is a long-term cost problem: the more users, the more calls, and the cheaper the API, the more visible the loss pressure becomes.
Another factor that cannot be ignored is the adaptation to Chinese computing power.
Before and after the release of V4, the news about the cooperation between DeepSeek and Huawei Ascend has increased significantly. The Ascend Supernode based on the Ascend 950 AI chip will fully support DeepSeek V4.
In a broader sense, this is about the security foundation for the next phase of Chinese large models: to keep developing, model companies must plan for adaptation to domestic computing power, cloud services, and software stacks.
Moreover, the competition for talent has become a fixed cost factor.
Caixin reports that one reason DeepSeek is opening up to financing is to prevent talent loss. Since 2025, China's AI talent market has reached the stage of "extravagant competition for core researchers". DeepSeek was previously strongest as a "small, high-density, research-oriented" team, but when industry giants recruit with cash, options, computing power, data, and paths to product deployment, that position is hard to defend in the long run.
To maintain its model iteration speed, DeepSeek must upgrade talent incentives from "project bonuses + research freedom" to "long-term alignment of interests".
DeepSeek also needs an ecosystem entry, not just model capabilities.
R1 made DeepSeek famous, but the real question is: after the fame, who takes over the traffic, computing power, API distribution, developers, enterprise customers, agent applications, office scenarios, and cloud marketplace access?
Almost every leading model has a giant standing behind it: Microsoft stands behind OpenAI, Amazon and Google stand behind Anthropic, Gemini belongs to the listed giant Google itself, and xAI has the traffic and capital of Musk's companies.
Chinese large companies have their own clouds, apps, office software, search engines, payment systems, content ecosystems, and enterprise customers. If DeepSeek remains completely independent, it can be one of the strongest open-source models, but it will be difficult to cover the last mile from model to industry infrastructure alone.
DeepSeek must convert its "model advantages" into "ecosystem advantages".
Why Tencent and Alibaba?
I analyzed in February 2025 that if DeepSeek really needs the money from large companies, Tencent and Alibaba are the most logical providers. Now this idea is even stronger. Which company will DeepSeek ultimately choose?
Tencent is the most suitable strategic partner in the sense of "low interference, wide distribution, many use cases".
The value of Tencent for DeepSeek lies in three things: entry, use cases, and organizational patience.
Tencent has WeChat, QQ, Tencent Meeting, Enterprise WeChat, Tencent Docs, Tencent Cloud, games, content, and a mini-program ecosystem. If DeepSeek wants to build agents, the hardest part is not the model itself but whether the agents can be embedded in users' real workflows. Tencent's use cases are a natural testing ground for agents.
More importantly, Tencent has a distinctive reputation as an investor. It has strategic interests, but compared with many other large companies, it is more accustomed to "ecosystem investments" than "acquisition-style investments". The examples of Meituan, JD.com, and Pinduoduo show that Tencent has historically accepted that its portfolio companies retain considerable independence.
For a company like DeepSeek, which highly values the research and control culture, this is very important.
DeepSeek would not have to compete with Tencent. Tencent does develop its own Hunyuan models, but its strength lies in integrating the best model capabilities into its applications and cloud ecosystem.
If DeepSeek gets Tencent's money, it finds a super - distributor as a partner.
Alibaba is the most suitable strategic partner for "Cloud + Open Source + Developer Ecosystem".
The value of Alibaba for DeepSeek lies more in the infrastructure.
Alibaba Cloud is one of the most important cloud computing platforms in China, and Tongyi Qianwen/Qwen is one of the strongest players in the Chinese open-source model ecosystem. Alibaba has also made industry-wide investments in AI: Moonshot AI, MiniMax, Zhipu, Baichuan, Lingyi Wanwu, and others are in Alibaba's portfolio.
For Alibaba, if DeepSeek remains strong, it brings several advantages:
It increases the attractiveness of Alibaba Cloud in the model market;
It supplements Alibaba's external ecosystem with a top-tier reasoning model;
Together with Qwen, it forms a dual pillar of "self-developed + strongest external open source";