Collective Price Hikes: Large Language Models Begin Demanding "Money"

DeepSeek becomes a legend by cutting prices, while Doubao gets criticized for charging fees! The large models are caught in a price war.

Over the past month, the commercialization of large models has produced its most dramatic and divisive scenes yet.

On one hand, ByteDance's Doubao began testing a paid tier. The topic "Doubao is stupid and charges" quickly topped the hot search list, and users complained mercilessly. On the other hand, DeepSeek cut the DeepSeek-V4-Pro API price to 25% of the original price, and then dropped the price for input cache hits to one-tenth of the original price.

On May 22, DeepSeek also announced that from June 1 the current promotional price would become the official price and the original price would not be restored. As a result, developers hailed Liang Wenfeng as "Liang Sheng" (Saint Liang).

▲ Weibo hot searches (left) and Xiaohongshu hot posts (right) about the price increase of large models

Meanwhile, the drama of "complaining while competing" is also unfolding.

Luo Fuli, who leads Xiaomi's MiMo large model, first published a post "criticizing" the industry's price war. Then MiMo shot to first place in global call volume on Hermes with its "100-trillion Token free plan".

▲ Partial screenshot of Luo Fuli's post on X (Source: X)

At present, domestic and overseas prices are seriously out of step: at the same call volume, the long-context version of the overseas giant's GPT-5.5 costs more than 40 times as much as the domestic DeepSeek-V4-Pro.

To some extent, one can foresee the "helpless choice" facing domestic large model manufacturers: everyone knows that AI burns money, but at this juncture, should they raise prices to recover costs or keep cutting prices to capture the ecosystem?

To make sense of this "AI ledger", we compiled and compared the subscription packages, API call fees, and video generation prices of dozens of mainstream large model manufacturers at home and abroad.

The conclusion is clear: large model manufacturers are collectively saying goodbye to generous subsidies, and the era of freeloading on large models is coming to an end.

01. Subscription system reshuffle: Farewell to "unlimited calls"

The free lunch of large models is being taken off the table collectively

Compared with the rough-and-ready era of early 2024, when a single ChatGPT Plus membership could dominate the market, the pricing systems of domestic large models have changed fundamentally.

The core change is that manufacturers no longer absorb the cost of unrestrained compute consumption, and pure "unlimited calls" have almost completely disappeared. In their place, a set of complex metering systems has emerged, built on units such as Credits, Tokens, and Agent fuel values.

▲ Comparison of domestic large model subscription package prices (Compiled by Zhidx, statistics as of: 2026/05/21)

We can intuitively see that the membership systems of domestic mainstream platforms have formed three clearly defined "price bands":

The first tier can be regarded as the "attractive price", and most platforms set the entry threshold below 50 yuan.

The entry-level packages of MiniMax, Xiaomi, Kimi, Zhipu, ByteDance, etc. are mostly around 40 yuan. However, the starting price of Alibaba's Token Plan standard version is 198 yuan per month, which is significantly higher, perhaps due to its larger Token quota and support for multimodal capabilities.

The second tier is concentrated in the range of 80 yuan to 200 yuan, which is also the most competitive mainstream price band at present.

At least eight platforms, including Alibaba, Baidu, Jieyue Xingchen (StepFun), MiniMax, Xiaomi, Tencent, Zhipu, and ByteDance, have placed a core upgrade package in this range.

Further up lies the heavy-duty productivity tier.

The high-end packages of many platforms have exceeded 500 yuan per month, and the highest packages of ByteDance and Alibaba have reached the thousand-yuan level. Among them, the highest price of Alibaba's Token Plan premium version has reached 1,398 yuan per month.

Looking overseas, the situation is even more radical.

▲ Comparison of overseas large model subscription package prices (Compiled by Zhidx, statistics as of: 2026/05/21)

The lowest-tier products are generally around 8 US dollars (equivalent to about 54.3 yuan), and the mainstream subscriptions are concentrated around 20 US dollars (equivalent to about 136 yuan).

Meanwhile, the overseas "Big Three" have also moved quickly toward high-end memberships priced at 100 US dollars or even more than 250 US dollars. After Google I/O 2026 in May this year, Google cut the price of Gemini Ultra from 249.99 US dollars per month to 199.99 US dollars per month (equivalent to about 1,359.9 yuan) and added a 99.99 US dollar tier.

Compared with the domestic price system, one month of Gemini Ultra (high-end version) already costs close to a full year of a domestic plan. The 100 US dollar tiers of Gemini Ultra (basic version), ChatGPT Pro, and Claude Max are roughly 3 to 5 times the price of domestic mid- to high-end packages.
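The yuan equivalents quoted in this section imply an exchange rate of roughly 6.8 yuan per US dollar. The following is a minimal Python sketch of that conversion; the rate constant and the function name are assumptions inferred from the article's own figures, not an official rate or any vendor's price list.

```python
# Assumed exchange rate, inferred from the article's own conversions
# (20 USD ~ 136 yuan, 199.99 USD ~ 1,359.9 yuan); not an official rate.
CNY_PER_USD = 6.8


def usd_to_cny(usd: float, rate: float = CNY_PER_USD) -> float:
    """Convert a US-dollar price to yuan, rounded to one decimal place."""
    return round(usd * rate, 1)


# Monthly subscription tiers mentioned in the text, in US dollars.
tiers_usd = {
    "entry tier": 8.00,
    "mainstream tier": 20.00,
    "Gemini Ultra (basic)": 99.99,
    "Gemini Ultra (high-end)": 199.99,
}

for name, usd in tiers_usd.items():
    print(f"{name}: ${usd:.2f} ~ {usd_to_cny(usd)} yuan per month")
```

Under this assumed rate the 8 US dollar tier comes to 54.4 yuan, marginally above the 54.3 yuan quoted, which suggests a slightly lower rate was used for that figure.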

02. Interface price war:

Overseas giants maintain high prices, while domestic players seize market share

If subscription plans for ordinary users still carry an element of "attracting new customers", the API prices aimed at developers and the Agent ecosystem expose the differences between commercial strategies at home and abroad.

Comparing the API price lists, the gap is astonishing.

Take DeepSeek-V4-Pro as an example: the combined price of its API input and output has been compressed to as little as about 9 yuan per million tokens.

▲ Comparison of the latest API prices of domestic flagship large models (Compiled by Zhidx, statistics as of: 2026/05/21. Note: The API price of DeepSeek-V4-Pro has been officially adjusted to the current promotional price, and the original price will not be restored.)

Overseas, by contrast, the combined price of Gemini 3.1 Pro Preview in the long-context scenario reaches 149.6 yuan per million tokens; Claude Opus 4.7 reaches 204 yuan; and the long-context version of GPT-5.5 reaches 374 yuan.

▲ Comparison of the latest API prices of overseas flagship large models (Compiled by Zhidx, statistics as of: 2026/05/21)
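The size of the gap can be checked directly from the figures quoted above. The sketch below divides each overseas price by the DeepSeek-V4-Pro price; the dictionary layout and the function name are illustrative assumptions, and the numbers are the combined prices given in the text, not a full price list.

```python
# Combined input + output prices in yuan per million tokens, as quoted in the text.
DEEPSEEK_V4_PRO = 9.0  # "about 9 yuan per million tokens at the lowest"

overseas_long_context = {
    "Gemini 3.1 Pro Preview": 149.6,
    "Claude Opus 4.7": 204.0,
    "GPT-5.5": 374.0,
}


def price_multiple(price: float, baseline: float) -> float:
    """Return how many times the baseline price a given price is."""
    return price / baseline


for model, price in overseas_long_context.items():
    multiple = price_multiple(price, DEEPSEEK_V4_PRO)
    print(f"{model}: {price} yuan, about {multiple:.1f}x DeepSeek-V4-Pro")
```

The GPT-5.5 result, about 41.6 times, matches the "more than 40 times" figure cited earlier in the article.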

Behind the data are completely different strategies.

The overseas "Big Three", OpenAI, Google, and Anthropic, try to maintain high gross margins and high ARPU (Average Revenue Per User), relying on high-end enterprise customers to cover their huge computing power costs. In contrast, domestic manufacturers are squeezing margins aggressively, or even subsidizing at a loss, to lock in the Agent ecosystem and the developer market within a short window.

03. The price of a video has risen to nearly 8 times its original level:

The most costly AI capability can no longer withstand the pressure

Among all AI capabilities, video generation is the biggest "GPU-guzzling" money pit and the area hit hardest by price increases.

Currently, several leading players have emerged among domestic video generation models, such as ByteDance's Seedance 2.0, Kuaishou's Kling, MiniMax's Hailuo, and Alibaba's HappyHorse. Here too, compared with the crude early model of "charging by membership", pricing is moving closer to a "cloud GPU rental logic".

▲ Comparison of the prices of domestic mainstream video generation large models (Compiled by Zhidx, statistics as of: 2026/05/21)

Resolution, generation duration, whether audio is included, whether a reference video is supplied, and whether the user pays to skip the queue all directly affect the price.

Take ByteDance's breakout model Seedance 2.0 as an example; its price changes are a "microcosm of the industry". It adjusted its pricing three times in less than a month. The cost of generating a 15-second video rose from about 0.65 yuan to about 5 yuan, an increase of nearly 6.7 times (about 7.7 times the original price).
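The "nearly 8 times" in the section heading and the "nearly 6.7 times" in the paragraph above describe the same price change measured two ways. A short sketch of the arithmetic, using only the two prices quoted in the text (the function name is an illustrative assumption):

```python
def price_change(old: float, new: float) -> tuple[float, float]:
    """Return (new price as a multiple of old, increase relative to old)."""
    multiple = new / old
    increase = (new - old) / old
    return multiple, increase


# 15-second Seedance 2.0 video: about 0.65 yuan before, about 5 yuan after.
multiple, increase = price_change(0.65, 5.0)
print(f"New price is {multiple:.1f}x the old price")
print(f"Increase of {increase:.1f}x, or {increase:.0%}")
```

The new price is about 7.7 times the old one, which is the same thing as an increase of about 6.7 times, or roughly 669%.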

When traffic surges during the evening peak, users must either wait in a long queue or spend high-priced "VIP points" to buy priority access to computing power.

04. DeepSeek slashes prices sharply:

Why does Liang Wenfeng dare to cut the price to rock bottom?

In the entire price war, the most special player is still DeepSeek.

While the rest of the industry was quietly tightening benefits, DeepSeek's two consecutive steep price cuts at the end of April were extremely eye-catching. The input price for cache hits on DeepSeek-V4-Pro has now been slashed to an astonishing 0.025 yuan per million tokens.

▲ Source: DeepSeek official

Without having fully achieved self-sufficiency, why does DeepSeek still dare to "cut prices twice"?

Part of the answer lies in capital: a huge financing window, rumored to be as large as 50 billion yuan, is opening. The more important trump card, however, is an in-depth reconstruction of the underlying hardware and computing power architecture.

The latest flagship, DeepSeek-V4, not only radically optimizes long-context efficiency but also, more importantly, actively adapts to domestic chips from vendors such as Huawei and Cambricon.

This efficient combination of domestic hardware and algorithm optimization has significantly reduced per-token inference computation and KV Cache usage in the million-token scenario. Compared with DeepSeek-V3.2, its million-token inference cost has dropped by 73%.
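To put the 73% figure in perspective, the sketch below works out what remains of the cost and how much more work the same budget could cover. The baseline is normalized to 1.0 because the article gives no absolute cost for DeepSeek-V3.2; that unit and the function name are assumptions made for illustration.

```python
def cost_after_reduction(baseline: float, reduction: float) -> float:
    """Cost remaining after a fractional reduction (0.73 means a 73% drop)."""
    return baseline * (1.0 - reduction)


# DeepSeek-V3.2 million-token inference cost, normalized to 1.0 (hypothetical unit).
baseline_cost = 1.0
v4_cost = cost_after_reduction(baseline_cost, 0.73)
workload_gain = baseline_cost / v4_cost

print(f"Remaining cost: {v4_cost:.2f} of baseline")
print(f"Same budget covers about {workload_gain:.1f}x the workload")
```

In other words, a 73% reduction leaves 27% of the original cost, so the same inference budget stretches to roughly 3.7 times as many million-token requests.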

Simply put, DeepSeek is using underlying technology to cut costs and prices at the same time.

If Huawei's Ascend 950 super-node continues to be deployed at scale and the cost of domestic computing power keeps falling, this ultra-low-price strategy may further shake the pricing of the entire AI market.

05. The call volume soars, but the financial reports are still in the red:

Manufacturers still need to keep moving forward

However, the overall financial situation of the large model industry is still dismal.

Judging from OpenRouter's data, an indirect indicator, user demand is indeed exploding, and Tokens have entered a full "boiling" state.

Over the past six months, domestic models have rapidly gained visibility. Models such as DeepSeek, Tencent HY, Alibaba Qwen, Kimi, MiniMax, and Xiaomi MiMo now appear consistently in the global Token share ranking.

▲ Market share of large models (Source: OpenRouter)

The latest weekly call data from OpenRouter shows that the call volumes of DeepSeek-V4-Flash and Hy3 preview have entered the first echelon and are comparable to each other, with both reaching 1.46T

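The views expressed in this article are solely those of the author. The 36Kr platform provides only information storage services.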
