Ahead of the DeepSeek V4 release, Luo Fuli made a bold move: Xiaomi's most powerful model yet, MiMo-V2.5, launched in a surprise late-night release.
Just one month after its last update, Xiaomi's large model has made a dramatic leap in capability, using 42% fewer tokens than Kimi K2.6.
According to an April 23rd report by Zhidx, Xiaomi's MiMo large model team has officially announced four new models: the flagship inference model MiMo-V2.5-Pro and the full-modal Agent model MiMo-V2.5 are now open for public testing and will soon be open-sourced, while the V2.5-TTS series and V2.5-ASR are upcoming.
The head of Xiaomi's MiMo large model is Luo Fuli, a former core member of DeepSeek known in the industry as the "genius girl." Only 36 days have passed since the last major triple update of the MiMo-V2 series. When the previous generation was released, Luo Fuli posted that the model would be open-sourced once it was stable enough.
Similar to the previous generation model, the entire MiMo-V2.5 series is also designed for intelligent agent scenarios. MiMo-V2.5-Pro is specifically designed for long and difficult Agent tasks, while MiMo-V2.5 can cover most general Agent scenarios.
Xiaomi has also thoughtfully provided users with an official usage guide: MiMo-V2.5 supports native full-modal Agent capabilities, covering images, audio, and video. Compared with the Pro version, it has a faster average inference speed and is more suitable for tasks sensitive to latency.
Beyond raw performance, the other major upgrade in Xiaomi's new MiMo models is higher token efficiency. According to official figures, when achieving the same score on the agent benchmark ClawEval:
MiMo-V2.5-Pro uses 42% fewer tokens than Kimi K2.6, the open-source flagship multi-modal agent model Kimi released this week; MiMo-V2.5 uses 50% fewer tokens than Muse Spark, the closed-source multi-modal inference model Meta released at the beginning of this month.
In addition, Xiaomi has overhauled its model subscription plan, the Token Plan: the 4x Credits billing tier has been eliminated, 256k and 1M contexts are no longer billed differently, exclusive discounted nighttime rates have been introduced, and an auto-renewal mode has been added. Notably, when the Token Plan first launched, many users complained that prices were too high and the cheaper packages included too few tokens.
Zhidx tested MiMo-V2.5-Pro by asking it to "create a 3D side-scrolling fighting game." In just a few minutes, MiMo-V2.5-Pro wrote 1,123 lines of code and generated a "Dragon and Tiger Fighting Game." The game interface clearly shows the health bars, character names, countdown, and battle prompts. It also includes feedback systems such as hit sparks, blocking fragments, camera shake, and hit pauses, making it somewhat playable. However, the character models are simple, with little difference except for color and hats.
Interface of the Dragon and Tiger Fighting Game
Interestingly, in March this year, Xiaomi's MiMo-V2-Pro appeared anonymously on the OpenRouter platform as "Hunter Alpha" and was briefly mistaken by developers for the upcoming DeepSeek V4. Now the release of Xiaomi's new-generation MiMo-V2.5 coincides with news that DeepSeek V4 will arrive this week.
Xiaomi MiMo Open Platform:
https://platform.xiaomimimo.com
Xiaomi MiMo Studio Experience Address:
https://aistudio.xiaomimimo.com/#/c
01. MiMo-V2.5-Pro: Specializing in Long and Difficult Intelligent Agent Tasks, Completing an Undergraduate Project in 4.3 Hours
Xiaomi officially claims that MiMo-V2.5-Pro is the most powerful model of Xiaomi's MiMo to date. In terms of general intelligent agent capabilities, complex software engineering, and long-range tasks, it can match global top Agent models such as Claude Opus 4.6 and GPT-5.4, and has improved compared to the previous generation model, MiMo-V2-Pro.
According to Xiaomi's internal tests, when paired with a suitable operating framework, MiMo-V2.5-Pro can stably complete long-range tasks involving nearly a thousand rounds of tool calls in a single instance. In intelligent agent scenarios, the model's ability to follow instructions has improved. It can capture implicit requirements in the context and maintain logical consistency over an extended period.
Based on the evaluation suite MiMo Coding Bench developed by Xiaomi's MiMo team, the gap between MiMo-V2.5-Pro and Claude Opus 4.6 has further narrowed. Their scores are 73.7 and 77.1 respectively, while MiMo-V2-Pro scored 71.5.
A Twitter user tested MiMo-V2.5-Pro with a previously popular question: "I want to go to a car wash, and the car wash is 50 meters away. Should I walk or drive?" MiMo-V2.5-Pro lived up to expectations and gave the correct answer.
Xiaomi MiMo has released several practical cases of MiMo-V2.5-Pro.
The first is "Implement a complete SysY compiler in Rust." The difficulty of this task is that the model must independently complete the lexical analyzer, syntax analyzer, AST, Koopa IR code generation, RISC-V assembly backend, and performance optimization.
In actual operation, the model built the compiler layer by layer: it first constructed the complete pipeline framework and then tackled each stage. In the per-item scores, the model earned full marks on Koopa IR, the RISC-V backend, and performance optimization. The first compilation achieved a 59% cold-start pass rate, meaning the architecture was correct before any tests were run. In round 512, a one-shot refactor by the model caused the lv9/riscv stage to regress by two test points; the model diagnosed the issue itself, recovered, and continued to make progress.
This task is a project for the "Principles of Compilation" course at Peking University. It usually takes Peking University undergraduates several weeks to complete, but MiMo-V2.5-Pro completed it in 4.3 hours with 672 tool calls and achieved a full score of 233 on the hidden test set.
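The staged pipeline the case describes (lexer, parser, AST, IR generation, backend) can be sketched in miniature. Below is an illustrative Rust skeleton shrunk to a toy addition language; all type and function names are hypothetical and this is not MiMo's actual generated code:

```rust
// Hypothetical sketch of a staged compiler pipeline
// (lexer -> parser -> AST -> IR), shrunk to a toy addition language.

#[derive(Debug, PartialEq)]
enum Token {
    Int(i32),
    Plus,
    Eof,
}

/// Lexical analysis: source text -> token stream.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens: Vec<Token> = src
        .split_whitespace()
        .map(|w| match w {
            "+" => Token::Plus,
            n => Token::Int(n.parse().expect("integer literal")),
        })
        .collect();
    tokens.push(Token::Eof);
    tokens
}

/// Abstract syntax tree for the toy grammar (stand-in for SysY's full AST).
enum Ast {
    Num(i32),
    Add(Box<Ast>, Box<Ast>),
}

/// Syntax analysis: token stream -> AST (left-associative `+` only).
fn parse(tokens: &[Token]) -> Ast {
    let mut iter = tokens.iter();
    let mut node = match iter.next() {
        Some(Token::Int(n)) => Ast::Num(*n),
        _ => panic!("expected integer"),
    };
    while let Some(Token::Plus) = iter.next() {
        match iter.next() {
            Some(Token::Int(n)) => {
                node = Ast::Add(Box::new(node), Box::new(Ast::Num(*n)));
            }
            _ => panic!("expected integer after '+'"),
        }
    }
    node
}

/// IR generation: AST -> flat three-address instructions
/// (a stand-in for the Koopa IR stage); returns the result operand.
fn to_ir(ast: &Ast, ir: &mut Vec<String>, next_tmp: &mut usize) -> String {
    match ast {
        Ast::Num(n) => n.to_string(),
        Ast::Add(l, r) => {
            let a = to_ir(l, ir, next_tmp);
            let b = to_ir(r, ir, next_tmp);
            let t = format!("%{}", *next_tmp);
            *next_tmp += 1;
            ir.push(format!("{} = add {}, {}", t, a, b));
            t
        }
    }
}

fn main() {
    let ast = parse(&lex("1 + 2 + 3"));
    let mut ir = Vec::new();
    let mut tmp = 0;
    let result = to_ir(&ast, &mut ir, &mut tmp);
    for line in &ir {
        println!("{}", line); // %0 = add 1, 2  then  %1 = add %0, 3
    }
    assert_eq!(result, "%1");
}
```

The full task would add an assembly backend emitting RISC-V from the IR, but the same stage-by-stage structure carries through.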
The second official case is the development of a video editor, prompted with "Build a video editor web application." The app MiMo-V2.5-Pro delivered includes a multi-track timeline, clip trimming, cross-fades, audio mixing, and an export pipeline. The final codebase ran to 8,192 lines, completed in 11.5 hours of autonomous work across 1,868 tool calls.
The third case is an analog circuit EDA task. The requirement was "Design and optimize a complete flipped voltage follower low-dropout linear regulator (FVF-LDO) from scratch based on the TSMC 180nm CMOS process."
During the task, the model needed to determine the power transistor sizing, adjust the compensation network, and select appropriate bias voltages so that six indicators (phase margin, line regulation, load regulation, quiescent current, power supply rejection ratio, and transient response) met the specifications simultaneously.
Experienced analog circuit designers usually need several days for such a project. Researchers connected MiMo-V2.5-Pro to an ngspice simulation loop, with Claude Code serving as the agent framework. After about an hour of closed-loop iteration, it produced a design meeting all target indicators, with four of them improving by an order of magnitude over its initial version.
02. MiMo-V2.5: Can See, Hear, and Read, Responsible for General Intelligent Agent Scenarios
MiMo-V2.5 is a native full-modal model specifically designed for intelligent agent scenarios. It can see, hear, and read simultaneously and take actions based on the perceived information.
This model has two key upgrades this time: Agent capabilities comprehensively surpass MiMo-V2-Pro, and multi-modal perception comprehensively surpasses MiMo-V2-Omni. MiMo-V2-Pro is the previous generation flagship base large model of Xiaomi's MiMo series, and MiMo-V2-Omni is the previous generation full-modal Agent model.
Specifically, on ClawEval, an end-to-end trustworthy evaluation benchmark for AI agents, MiMo-V2.5 outperforms MiMo-V2-Pro while cutting API cost by roughly 50%; on cross-modal reasoning, video understanding, and chart analysis benchmarks such as VideoMME, CharXiv, and MMMU-Pro, MiMo-V2.5 performs close to, or even surpasses, closed-source models such as Claude Opus 4.6, Gemini 3 Pro, and GPT-5.4.
In terms of programming, Xiaomi's internal MiMo programming benchmark test shows that MiMo-V2.5 surpasses Gemini 3.1 Pro in daily programming tasks but still has a significant gap with Claude Opus 4.6.
03. Token Plan: 20% Off During Exclusive Nighttime Hours, Save Nearly $1,000 with an Annual Subscription
Along with the release of the MiMo-V2.5 series, Xiaomi has also optimized its subscription-based API call package, MiMo Token Plan. This plan allows users to use 8 models of the MiMo series, namely the flagship inference models MiMo-V2-Pro and MiMo-V2.5-Pro, the all-around multi-modal models MiMo-V2-Omni and MiMo-V2.5, and the speech synthesis models MiMo-V2-TTS, MiMo-V2.5-TTS, MiMo-V2.5-TTS-VoiceClone, and MiMo-V2.5-TTS-VoiceDesign. The last three models are yet to be released.
First, the Credits rate has been updated and is now cheaper: the 1 Token = 4 Credits billing tier has been eliminated, and the credit multiplier no longer differs between the 256k and 1M context windows.
The new Credits billing method for the models is as follows:
MiMo-V2.5: 1x (Consuming 1 Token = 1 Credit)
MiMo-V2.5-Pro: 2x (Consuming 1 Token = 2 Credits)
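Under the flat rates, credit consumption reduces to a simple product of tokens and multiplier. A minimal sketch: the 1x and 2x multipliers come from the rates above, while the token counts are hypothetical examples.

```rust
// Credit consumption under flat multipliers: credits = tokens * rate.
// The 1x (MiMo-V2.5) and 2x (MiMo-V2.5-Pro) rates are from the plan above;
// the token counts below are hypothetical.

fn credits_used(tokens: u64, multiplier: u64) -> u64 {
    tokens * multiplier
}

fn main() {
    // e.g. 3M tokens on MiMo-V2.5 (1x) plus 2M tokens on MiMo-V2.5-Pro (2x):
    let total = credits_used(3_000_000, 1) + credits_used(2_000_000, 2);
    println!("total credits: {}", total); // prints "total credits: 7000000"
    assert_eq!(total, 7_000_000);
}
```

Because context length no longer affects the multiplier, a 1M-context call costs the same per token as a 256k one.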
For comparison, the billing method when the MiMo Token Plan was first released was: