Xiaomi follows DeepSeek into the price war, cutting prices by up to 99% to compete head-on.
DeepSeek has just announced that the API prices will be permanently reduced, and Xiaomi has followed suit.
According to Zhidx on May 27th, Xiaomi officially announced today that API prices for the MiMo-V2.5 series will be permanently reduced and that the Token Plan billing system will be updated at the same time. Compared with the original prices, the largest reduction is 99%, and pricing will no longer vary with the length of the context window.
▲ The prices of the MiMo-V2.5 series API will be permanently reduced (Source: Xiaomi)
▲ Xiaomi MiMo's new Token Plan billing system: the price remains the same, but the credits are significantly increased (Source: Xiaomi)
A few days ago, DeepSeek announced that starting from June 1st, the current promotional price of DeepSeek-V4-Pro will become the regular price and will not revert to the original price. Previously, DeepSeek had reduced the price of the V4-Pro API to 25% of the original price and had further cut the price for input cache hits to 1/10 of the original price.
Within just one week, two Chinese large language model vendors have decided in succession to cut prices permanently, and API price competition in China has intensified again.
This time, Xiaomi MiMo has aligned its prices almost directly with DeepSeek's current API prices. The updated price list shows that for MiMo-V2.5, input with a cache hit costs 0.02 yuan per million tokens, input with a cache miss costs 1 yuan per million tokens, and output costs 2 yuan per million tokens. For MiMo-V2.5-Pro, the corresponding prices are 0.025 yuan, 3 yuan, and 6 yuan.
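To make the price list concrete, the following minimal sketch estimates what a single request would cost at the quoted rates. The prices come from the list above; the helper name `request_cost` and the idea of describing a request by one average cache-hit rate are assumptions made for illustration, not part of Xiaomi's billing API.

```python
# Prices in yuan per million tokens, as quoted in the updated price list.
PRICES = {
    "MiMo-V2.5": {"cache_hit": 0.02, "cache_miss": 1.0, "output": 2.0},
    "MiMo-V2.5-Pro": {"cache_hit": 0.025, "cache_miss": 3.0, "output": 6.0},
}


def request_cost(model, input_tokens, output_tokens, cache_hit_rate):
    """Estimate the cost in yuan of one request (hypothetical helper).

    cache_hit_rate is the assumed fraction of input tokens served from cache.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    price = PRICES[model]
    hit_tokens = input_tokens * cache_hit_rate
    miss_tokens = input_tokens - hit_tokens
    total = (
        hit_tokens * price["cache_hit"]
        + miss_tokens * price["cache_miss"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 200K input tokens, 4K output tokens, at two cache-hit rates.
    for rate in (0.0, 0.95):
        cost = request_cost("MiMo-V2.5-Pro", 200_000, 4_000, rate)
        print(f"cache-hit rate {rate:.0%}: {cost:.4f} yuan")
```

Because the cache-hit price is a small fraction of the cache-miss price, the assumed hit rate dominates the input side of the bill, which is why the quoted token allowances later in the article are tied to a high cache-hit rate.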
▲ Comparison table of the API prices of DeepSeek and Xiaomi MiMo (compiled by Zhidx)
It is worth noting that Xiaomi has abandoned context-length-based pricing for MiMo. Whether the context window is 256K or 1M, a single unified price applies.
Yesterday, we conducted a comprehensive analysis and comparison of the subscription plans and API pricing of several leading large language model vendors in China and abroad.
DeepSeek has permanently reduced its prices, Alibaba has discontinued the affordable Lite package in its coding plan, ByteDance has removed its affordable coding plan from the platform, and Zhipu raised its API prices by 83% in the first quarter of 2026. According to incomplete statistics, at least five Chinese large language model vendors, including Xiaomi, ByteDance, Alibaba, Zhipu, and Tencent, have significantly adjusted their plan structures in the past six months. Some vendors have begun to cut back affordable packages and reduce quotas, which raises overall prices.
Interestingly, Luo Fuli, the head of the Xiaomi MiMo model, recently spoke out publicly against price competition in the industry. On the other hand, thanks to the "100-billion-token free plan", Xiaomi MiMo once topped the global call-volume ranking on Hermes.
▲ Excerpt from Luo Fuli's post on X (Source: X)
Now, Xiaomi MiMo has officially joined this protracted price war.
01.
Major overhaul of the token plan:
Package capacity rises to 5 to 8 times the original
Aside from the permanent reduction of the API prices, the biggest change is actually to the Token Plan package system.
Xiaomi explained that the billing rules have been restructured. With prices unchanged, the credits in each package are significantly increased, and the usable quota generally reaches 5 to 8 times the original value.
Our analysis of the subscription plans of leading Chinese large language model vendors shows that, after the adjustment, Xiaomi's Lite package in the entry-level segment is priced similarly to the lowest tiers of vendors such as Kimi, ByteDance, and Jieyue Xingchen (StepFun), but it is not the cheapest on the market. Tencent Hunyuan Hy currently still has a Lite tier for 28 yuan per month.
At the high end, Xiaomi's Max package is also not the most expensive. Currently, Alibaba's Premium package costs 1,398 yuan per month, the Max tier of ByteDance's Agent plan costs 950 yuan per month, and the MiniMax Ultra-fast version costs almost 750 yuan per month.
▲ Comparison of the prices of the subscription plans of Chinese large language models (compiled by Zhidx, as of May 27th, 2026)
At the same time, Xiaomi announced the new conversion relationship between credits and tokens and disclosed the approximate number of tokens each package actually yields at a high cache-hit rate.
▲ Conversion relationship between credits and tokens of Xiaomi MiMo
According to Xiaomi's estimate, with a cache-hit rate of over 95%, the 39-yuan Lite package can theoretically reach over 500 million tokens when using MiMo-V2.5, the 99-yuan Standard package over 1.3 billion tokens, the 329-yuan Pro package over 4.7 billion tokens, and the 659-yuan Max package over 10 billion tokens.
Even with the more expensive MiMo-V2.5-Pro and a high cache-hit rate, the 39-yuan Lite package can reach over 190 million tokens, the 99-yuan Standard package over 500 million tokens, the 329-yuan Pro package about 1.8 billion tokens, and the 659-yuan Max package almost 4 billion tokens.
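As a rough illustration of what these allowances imply, the sketch below divides each package price by the approximate MiMo-V2.5-Pro token count quoted above to get an implied price per million tokens. The token counts are the article's rounded figures ("over", "about", "almost"), so the results are only approximate; the function name `implied_price_per_million` is a hypothetical helper introduced for this example.

```python
# (monthly price in yuan, approximate MiMo-V2.5-Pro tokens at a high cache-hit rate)
# Token counts are the rounded figures quoted in the text, not exact quotas.
PRO_MODEL_PACKAGES = {
    "Lite": (39, 190_000_000),
    "Standard": (99, 500_000_000),
    "Pro": (329, 1_800_000_000),
    "Max": (659, 4_000_000_000),
}


def implied_price_per_million(price_yuan, tokens):
    """Return the implied price in yuan per million tokens (hypothetical helper)."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return price_yuan / (tokens / 1_000_000)


if __name__ == "__main__":
    for name, (price, tokens) in PRO_MODEL_PACKAGES.items():
        per_million = implied_price_per_million(price, tokens)
        print(f"{name:<8} {price:>4} yuan -> about {per_million:.3f} yuan per million tokens")
```

Under these rounded figures, the implied unit price falls as the package gets larger, so the bigger packages are the better deal per token.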
Xiaomi particularly emphasized that in agent and coding scenarios, the number of tokens actually available will be significantly higher because the cache-hit rate is usually higher.
Xiaomi's "Million-Billion-Token Creator Incentive Program" has also attracted the attention of the developer community. Xiaomi announced that as of 4:08 p.m. on May 26th, all 100 billion tokens had been distributed and the event had ended ahead of schedule. All Token Plan users whose subscriptions are still valid will have their credit quotas reset at 12:00 a.m. on May 27th and will be switched automatically to the new billing rules.
For former paid users whose subscriptions have already expired, Xiaomi will offer additional benefits later.
02.
Why does Xiaomi dare to reduce the prices?
Continuous optimization of the inference system
Xiaomi also explained the inference optimizations behind this price cut.
Building on SGLang HiCache, the Xiaomi team added full support for SWA (Sliding Window Attention), which reduces KV-cache data traffic between storage tiers such as GPU memory, CPU memory, and SSD to about 1/7 of the pre-optimization level and increases the number of cacheable tokens about fivefold. At the same time, Xiaomi optimized mechanisms such as its expert-parallel scheme and its input-length bucketing strategy to further improve the cluster's input throughput and thus reduce the cost per token.
Simply put, Xiaomi's core logic is similar to DeepSeek's: a more aggressive cache-hit strategy on the one hand, and higher inference throughput on the other.
Essentially, the price war is a contest of inference systems and infrastructure capability.
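The intuition for why sliding window attention shrinks the KV cache can be sketched with simple arithmetic: a full-attention layer must keep keys and values for every token in the sequence, while an SWA layer only needs the most recent `window` tokens. The sketch below is illustrative only. The layer counts, head sizes, and window length are hypothetical values chosen for the example, not MiMo's actual architecture, and the code does not model SGLang HiCache or reproduce Xiaomi's reported 1/7 and fivefold figures.

```python
def kv_cache_bytes(seq_len, full_layers, swa_layers, window,
                   kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV-cache size in bytes for one sequence (illustrative)."""
    # Each cached token stores one key and one value vector per KV head.
    bytes_per_token_per_layer = 2 * kv_heads * head_dim * bytes_per_elem
    # Full-attention layers keep every token; SWA layers keep at most `window`.
    cached_slots = full_layers * seq_len + swa_layers * min(seq_len, window)
    return bytes_per_token_per_layer * cached_slots


if __name__ == "__main__":
    # Hypothetical configuration for illustration, NOT MiMo's real architecture.
    cfg = dict(kv_heads=8, head_dim=128, bytes_per_elem=2)
    seq_len = 256_000

    all_full = kv_cache_bytes(seq_len, full_layers=48, swa_layers=0, window=0, **cfg)
    hybrid = kv_cache_bytes(seq_len, full_layers=8, swa_layers=40, window=4096, **cfg)

    print(f"all full attention: {all_full / 2**30:.1f} GiB")
    print(f"hybrid with SWA:    {hybrid / 2**30:.1f} GiB")
    print(f"hybrid / full ratio: {hybrid / all_full:.2f}")
```

A smaller per-sequence cache means less data to move between GPU memory, CPU memory, and SSD, and more sequences that fit in the same cache budget, which is the general direction of the savings described above.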
03.
After the model has entered the top-tier group
Is Xiaomi MiMo playing the price card?
On April 23rd this year, Xiaomi launched the official beta of the Xiaomi MiMo-V2.5 model family and introduced several versions, including MiMo-V2.5, V2.5-Pro, the V2.5-TTS series, and V2.5-ASR. Among them, MiMo-V2.5-Pro specifically targets scenarios such as agents, complex software development, and long-horizon tasks.
Currently, MiMo-V2.5-Pro ranks first among global open-source models on the Artificial Analysis overall intelligence list and is in the top five among all large language models worldwide; its agent index also ranks first among global open-source models.
On April 28th, Xiaomi officially open-sourced the MiMo-V2.5 series under the MIT license, which permits commercial deployment and further training without additional approval.
What quickly made MiMo well known in the developer community was the earlier "100-billion-token free plan".
On May 9th, Hermes Agent ("Hermes") overtook OpenClaw ("Lobster") for the first time and topped the global call-volume ranking on OpenRouter. At that time, MiMo-V2-Pro ranked first among the five models with the most monthly calls on Hermes, followed by MiniMax M2.7, NVIDIA Nemotron 3 Super, Jieyue Xingchen Step 3.5 Flash, and Tencent Hy3 preview.
However, the rankings change very quickly.
As of 9:00 a.m. on May 27th, MiMo-V2.5-Pro ranked fourteenth in the OpenRouter weekly call list.
▲ OpenRouter weekly call list (as of 9:00 a.m. on May 27th)
In the monthly call list of Hermes Agent models, MiMo-V2-Pro has dropped from first place to 16th.