DeepSeek bids farewell to the era of "lone heroes".
The release of DeepSeek V4 is confirmed; the financing, by contrast, remains at the stage of media reports and deal rumors.
DeepSeek's official website now shows that "DeepSeek-V4 Preview" is online, stating that it has stronger Agent capabilities and top-tier reasoning, and can be used on the web, in the app, and via API.
DeepSeek began previewing V4 on April 24 in two versions, Pro and Flash, which will replace the V3 released in December 2024.
As for the financing, the official line so far remains "under negotiation". Tencent and Alibaba are reportedly in talks to invest in DeepSeek at a valuation of over $20 billion; DeepSeek, Tencent, and Alibaba have not immediately responded.
"Caixin" further reported that DeepSeek is in financing talks with Tencent and Alibaba as the investors. The two are expected to invest a combined $1.8 billion at a valuation of about $20 billion, though the deal has not been fully finalized.
What does the release of V4 mean?
The last time DeepSeek truly rewrote the industry narrative was the jump from V3 to R1. At that moment it achieved several things at once: low cost, high performance, open-source weights, and a reasoning model.
After R1, the real test for DeepSeek is whether it can keep delivering next-generation products.
Many technology companies get only one moment of glory. A blockbuster model, a viral media event, or a shock to the capital markets can push a company into the spotlight. The real challenge is whether it can keep iterating after the spotlight fades.
V4 at least shows that DeepSeek did not stop at the R1 moment. It made its name with a low-cost surprise strike, and it is still iterating its flagship model.
Moreover, V4's focus this time has moved a step beyond "being cheap".
According to DeepSeek's official Hugging Face page, the V4 series includes two MoE models: V4-Pro has 1.6T total parameters and 49B active parameters; V4-Flash has 284B total parameters and 13B active parameters. Both support a 1-million-token context.
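A quick back-of-the-envelope reading of those figures (an illustrative sketch, not from DeepSeek's materials): in a MoE model, per-token compute scales with the active parameters, not the total, so the activation ratio hints at serving cost.

```python
def active_ratio(total_b: float, active_b: float) -> float:
    """Active parameters as a percentage of total parameters."""
    return 100 * active_b / total_b

# Figures from the Hugging Face page cited above, in billions:
pro = active_ratio(1600, 49)    # V4-Pro: 1.6T total, 49B active
flash = active_ratio(284, 13)   # V4-Flash: 284B total, 13B active

print(f"V4-Pro activates ~{pro:.1f}% of its parameters per token")
print(f"V4-Flash activates ~{flash:.1f}%")
```

Roughly 3% for Pro and under 5% for Flash: despite the huge total parameter counts, per-token compute stays comparatively small, which is consistent with the "efficiency" framing of the release.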
This shows that DeepSeek is shifting the focus of competition from simple cost-effectiveness to more complex task scenarios.
All these capabilities point in the same direction: more complex task execution. The long context lets the model handle longer materials and continuous tasks; Thinking Mode corresponds to complex reasoning; and Tool Calls and JSON output are better suited to connecting with external systems and being called reliably by the application layer. This is why V4 is discussed in the context of Agents.
Looking further, what really matters about V4 is that DeepSeek is still focused on "efficiency": it wants the model to read longer texts, and it wants to achieve this at lower cost.
The next stage of large-model competition will not take place only on the leaderboard. The leaderboard answers "who is stronger", but the real business world also asks: who can deliver this capability stably, cheaply, and at scale?
It is one thing for a model to be able to read 1 million tokens; it is another for large numbers of users, developers, and enterprise customers to be able to call it at an affordable price.
If long context stays at the demonstration stage, it is more of a technical show-off. If it can be made cost-effective and integrated into APIs, enterprise applications, and Agent workflows, it becomes an infrastructure capability.
So V4 continues DeepSeek's core approach: not simply piling up parameters but continuing to squeeze system efficiency.
V4-Pro is more like the capability-ceiling version, taking on complex reasoning, long contexts, and harder tasks; V4-Flash is more like the high-volume version, covering speed, cost, and high-frequency scenarios.
This shows that DeepSeek has thought about how to serve different users, different scenarios, and different cost structures.
Web, app, and API are being pushed simultaneously; Pro and Flash appear together; long context, reasoning, and tool calls are all emphasized. DeepSeek is preparing to become a player that must bear real applications, real call volume, and real commercialization pressure.
Why does DeepSeek need to raise funds?
More precisely, the old funding structure is no longer suitable for the new competition stage.
In the past, DeepSeek's uniqueness was that it had the capital, compute, and engineering accumulation of High-Flyer Quant behind it. It could therefore sustain a "non-VC narrative" for a long time: no rush to raise funds, no rush to commercialize, no rush to tell a growth story.
DeepSeek is said to have rejected multiple financing offers from top Chinese VCs and technology giants in the past.
But in the V4 stage, the situation has changed.
Both training and inference costs are rising.
In the V3 era, what shocked the market most was that DeepSeek built a powerful model at an extremely low training cost. The V3 technical report states that DeepSeek-V3 is a MoE model with 671B total parameters and 37B active parameters, pre-trained on 14.8T tokens, with the full training taking only 2.788M H800 GPU hours.
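The headline cost implied by those numbers is a one-line calculation; the $2-per-GPU-hour H800 rental price is the assumption the V3 technical report itself used to arrive at its widely quoted figure.

```python
gpu_hours = 2.788e6        # total H800 GPU hours for V3's full training
price_per_hour = 2.0       # assumed H800 rental price, USD per GPU-hour
cost_usd = gpu_hours * price_per_hour
print(f"~${cost_usd / 1e6:.3f}M")  # ~$5.576M
```

That sub-$6M figure is the baseline against which V4's much larger parameter and token counts should be read.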
But V4 has become a model with 1.6T total parameters, over 32T pre-training tokens, a million-token context, and Agent capabilities.
Continuing at this scale, even a company as good at engineering optimization as DeepSeek cannot entirely avoid capital expenditure. And the inference side is a long-term cost black hole: the more users, the more calls, and the cheaper the API, the more obvious the loss pressure.
Another variable that cannot be ignored is the adaptation of domestic computing power.
Around the release of V4, news about DeepSeek and Huawei Ascend increased noticeably: the Ascend supernode based on the Ascend 950 AI chip will fully support DeepSeek V4.
Looking deeper, this concerns the safety cushion for the next stage of Chinese large models: if model companies want to keep iterating, they must address adaptation to domestic compute, cloud services, and software stacks.
In addition, the competition for talent has become a hard cost.
"Caixin" reported that one reason DeepSeek opened up to financing is to prevent talent loss. Since 2025, the Chinese AI talent market has entered a stage of sky-high bidding for core researchers. DeepSeek was previously known for its "small team, high density, research-first" approach, but once industry giants start poaching with cash, options, compute, data, and chances to ship products, that approach is hard to defend for long.
If DeepSeek wants to maintain its pace of model iteration, it must upgrade talent incentives from "project honor + research freedom" to "long-term alignment of interests".
DeepSeek also needs an ecological entrance, not just model capabilities.
R1 made DeepSeek famous, but the real question is: after the fame, who carries the traffic, compute, APIs, developers, enterprise customers, Agent applications, office scenarios, and cloud-marketplace entrances?
Microsoft stands behind OpenAI; Amazon and Google behind Anthropic; Gemini sits inside the listed giant Google; xAI has the traffic and capital of the Musk ecosystem.
Domestic giants have their own clouds, apps, office software, search engines, payment systems, content ecosystems, and enterprise customers. If DeepSeek stays completely independent, it can remain one of the strongest open-source models, but it will struggle to complete the last mile from model to industrial infrastructure alone.
DeepSeek needs to convert its "model advantage" into an "ecological advantage".
Why Tencent and Alibaba?
I analyzed in February 2025 that if DeepSeek has to take money from large companies, Tencent and Alibaba are indeed the two most reasonable choices. Now, this logic is even stronger. Which company's money will DeepSeek ultimately take?
Tencent is best suited to be a strategic shareholder of the "low intervention, high distribution, deep scenario involvement" kind.
The value of Tencent to DeepSeek lies in three aspects: entrance, scenario, and organizational patience.
Tencent has WeChat, QQ, Tencent Meeting, Tencent Workplace, Tencent Docs, Tencent Cloud, games, content, and the mini-program ecosystem. If DeepSeek wants to build Agents, the hardest part is not the model itself but whether the Agent can enter real user workflows. Tencent's scenarios are a natural testing ground for Agents.
More importantly, Tencent's track record as an investor is genuinely distinctive. It is not without strategic demands, but compared with many large companies, Tencent is more accustomed to "ecosystem investments" than "acquisition-style investments". Cases such as Meituan, JD.com, and Pinduoduo show that Tencent has been willing to let investee companies keep a considerable degree of independence.
For a company like DeepSeek that attaches great importance to research culture and control rights, this is very important.
DeepSeek doesn't need to compete with Tencent. Tencent also has its own Hunyuan model, but Tencent's real strength lies in integrating the best model capabilities into its own applications and cloud ecosystem.
If DeepSeek takes Tencent's money, it is effectively picking up a super-distribution shareholder.
Alibaba is best suited to be a strategic shareholder for the "cloud + open source + developer ecosystem".
Alibaba's value to DeepSeek is more infrastructure-oriented.
Alibaba Cloud is one of China's most important cloud-computing platforms, and Tongyi Qianwen/Qwen is one of the strongest players in the domestic open-source model ecosystem. Alibaba has previously pursued a "buy the whole track" strategy in AI: Moonshot AI (Yuezhianmian), MiniMax, Zhipu, Baichuan, and 01.AI (Lingyiwanwu) have all entered its investment portfolio.
For Alibaba, if DeepSeek continues to be strong, it will bring several values:
Enhance the attractiveness of Alibaba Cloud in the model market;
Complement Alibaba's external ecosystem in top-tier reasoning models;
Form a dual-pivot structure with Qwen of "self-developed + the strongest external open-source system";
Prevent DeepSeek from completely siding with Tencent or other large companies.
When DeepSeek R1 was released, a series of distilled models used Qwen and Llama as bases, such as the DeepSeek-R1-Distill-Qwen series. This shows that Qwen has become an important base in the domestic open-source ecosystem.
Since DeepSeek's distilled-model lineup already uses Qwen as one of its main bases, Alibaba and DeepSeek have a natural point of intersection in the open-source ecosystem.
Why not ByteDance?
ByteDance certainly has money, traffic, models, and products. But precisely because it has everything, it may not be the most suitable shareholder for DeepSeek.
ByteDance has invested heavily in Doubao, a very strong consumer-facing AI product. If ByteDance invested in DeepSeek there would be strong strategic synergy, but also stronger competitive tension: a DeepSeek that wants to keep an independent research route may be unwilling to enter a large system that already has a strong self-developed model and a strong consumer product.
By contrast, Tencent and Alibaba are more like "complementary shareholders": Tencent provides entrances and scenarios, Alibaba provides cloud and developer ecosystems. ByteDance would be a "head-on competitor shareholder": it can offer a lot, but it would also make DeepSeek's strategic independence more sensitive.
What does this round of financing really indicate?
In the past, DeepSeek looked like a counterexample to conventional wisdom: no financing, no marketing, no rush to commercialize. Relying on a group of young researchers and extreme engineering efficiency, it abruptly disrupted the global AI narrative.
But after V4, it faces a different set of rules: open-source models require continuous investment; low-price APIs need long-term compute support; Agent capabilities need closed-loop scenarios; a million-token context requires inference infrastructure; and top talent needs capitalized incentives. Not to mention that Chinese AI companies must also strike a balance among chips, clouds, regulation, and the international environment.
So if this round of financing closes, it shows that DeepSeek is beginning to accept a reality: model capability can be achieved once by a genius team, but an infrastructure war cannot be won by a genius team alone in the long run.