Just Now, Xiaomi Claimed the Mystery Model the Whole Internet Took for DeepSeek V4, and You Can "Raise Lobsters" for Free

The latest work from Luo Fuli's team at Xiaomi.

Zhidongxi reported on March 19 that, early this morning, Xiaomi's MiMo large-model series received a major three-part update: the flagship base model MiMo-V2-Pro, the full-modality agent model MiMo-V2-Omni, and the speech synthesis model MiMo-V2-TTS. All three newly released models are designed to strengthen agent capabilities.

Hunter Alpha and Healer Alpha, the anonymous models that topped the daily API call rankings on OpenRouter, the world's largest API aggregation platform, for several days last week and sparked heated discussion, turn out to be early test versions of MiMo-V2-Pro and MiMo-V2-Omni. Both anonymous models are still freely available to developers on OpenRouter.

Because Hunter Alpha's parameter specifications were reported to match those of DeepSeek V4, some had speculated that Hunter Alpha was DeepSeek V4. Peter Steinberger, the founder of OpenClaw, also posted on X asking for details about the two anonymous models.

The flagship base model MiMo-V2-Pro has more than 1T total parameters. In agent frameworks such as OpenClaw and Claude Code, MiMo-V2-Pro can handle complex workflow orchestration, long-term planning, and precise tool invocation without manual intervention. Even so, its API is priced at only one-fifth of Claude Opus 4.6.

▲Price comparison between MiMo-V2-Pro, Claude Opus 4.6, and Claude Sonnet 4.6 (Source: Xiaomi MiMo official website)

The full-modality base model Xiaomi MiMo-V2-Omni supports text, vision, and voice. It can understand complex environments across modalities, formulate and execute plans on its own, and adjust its strategy in real time when it hits exceptions, ultimately delivering complete results end to end.

The speech synthesis model Xiaomi MiMo-V2-TTS aims to let agents communicate with people in a warm, emotional, and soulful voice. It can generate multiple dialects, roles, and tones, and it recognizes formatting cues in the text such as punctuation marks, modal particles, and emphasis marks.

In addition, Xiaomi launched MiMo Claw at the same time on the official MiMo-V2-Pro model experience page, where users can try "raising lobsters" on MiMo-V2-Pro. Each instance a user creates is free to use for 30 minutes, and its data is automatically destroyed on exit.

Xiaomi's MiMo large model is led by Luo Fuli, a former core member of DeepSeek who is known in the industry as the "genius girl."

Zhidongxi tried MiMo Claw and asked it to "design a website for me that updates at 19:00 every day with the list of companies listing the next day on the Hong Kong Stock Exchange and the A-share market." MiMo Claw used a Python crawler to fetch the data on a schedule and then generated a static page ready for direct deployment. When its test run detected mismatches, it corrected and supplemented the Hong Kong stock data.

▲The new stock radar website generated by MiMo Claw

The MiMo-V2-Pro and MiMo-V2-Omni team is partnering with agent development framework teams such as OpenClaw, OpenCode, KiloCode, Blackbox, and Cline to offer developers worldwide one week of free API access.

MiMo-V2-Pro "lobster raising" experience page:

https://aistudio.xiaomimimo.com

01. MiMo-V2-Pro: Ranked third in comprehensive capabilities in China

Ranked third on the OpenClaw list

MiMo-V2-Pro has more than 1T total parameters and 42B activated parameters, about three times as many activated parameters as the previous-generation MiMo-V2-Flash. It supports a context length of 1 million tokens.

On Artificial Analysis, a widely followed global ranking of overall model intelligence, MiMo-V2-Pro ranks ninth worldwide and third in China, behind only Zhipu's GLM-5 and MiniMax-M2.7, which MiniMax released yesterday.

Across benchmarks that measure key model capabilities, MiMo-V2-Pro performs on par with Claude Sonnet 4.6, GPT 5.2, and Gemini 3.0 Pro in coding agents, general agents, and tool use.

According to official information, MiMo-V2-Pro is deeply optimized for agent scenarios. It has undergone supervised fine-tuning and reinforcement learning across complex and diverse agent architectures, giving it stronger tool-invocation and multi-step reasoning capabilities.

On PinchBench and Claw-Eval, the standard OpenClaw evaluation leaderboards, MiMo-V2-Pro ranks third, behind only Claude Sonnet 4.6 and Claude Opus 4.6. With its 1M-token ultra-long context window, MiMo-V2-Pro can also support demanding, complex real-world Claw workflows.

In programming, in-depth evaluations by Xiaomi's internal engineers found that MiMo-V2-Pro feels close to Claude Opus 4.6 in use and demonstrates high-level coding intelligence, with stronger system design and task planning, a more elegant code style, and a more efficient and direct path to solving problems.

In front-end application scenarios, MiMo-V2-Pro can generate a well-designed and fully functional web page in one step in OpenClaw.

Prompt: Imitate the aesthetics of 1990s printed magazines. Use serif fonts like Playfair Display for headings and monospaced fonts like IBM Plex Mono for the body text. Use a magazine-style multi-column grid layout with unequal column widths. Offset the main heading to the left of the viewport to imply print overflow. Apply a sepia 0.2 brown filter and noise overlay to images. Mimic the page-turning effect for page transitions. Design the navigation to resemble a magazine table of contents, with numbered items (01/02/03) that increase in size on hover. Design the bottom of the page to look like a magazine copyright page with a fake ISSN number. Use a paper-texture background.

Pricing is tiered by context length. Within a 256K context, it costs 1 US dollar (approximately 6.87 RMB) per million input tokens and 3 US dollars (approximately 20.62 RMB) per million output tokens. Within a 1M context, it costs 2 US dollars (approximately 13.75 RMB) per million input tokens and 6 US dollars (approximately 41.24 RMB) per million output tokens.
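To make the tiers concrete, here is a minimal Python sketch that estimates the cost of a single request from the prices quoted above. It rests on several assumptions that the article does not confirm: that the tier is chosen by the request's total context (input plus output tokens), that "256K" means 256,000 tokens, and that one rate applies to the whole request. The function name `estimate_cost_usd` is hypothetical and is not part of any Xiaomi API.

```python
# Prices in USD per million tokens, as quoted in the article.
# Each tier: (context limit in tokens, input price, output price).
TIERS = [
    (256_000, 1.0, 3.0),    # "within 256K context" (assumed to mean 256,000 tokens)
    (1_000_000, 2.0, 6.0),  # "within 1M context"
]


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under the quoted tiered pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the tier is selected by total context, not by input alone.
    context = input_tokens + output_tokens
    for limit, input_price, output_price in TIERS:
        if context <= limit:
            total = input_tokens * input_price + output_tokens * output_price
            return total / 1_000_000
    raise ValueError("request exceeds the 1M-token context window")


if __name__ == "__main__":
    # A short request that stays in the lower tier.
    print(f"{estimate_cost_usd(100_000, 10_000):.2f} USD")
    # A long-context request that falls in the higher tier.
    print(f"{estimate_cost_usd(500_000, 20_000):.2f} USD")
```

Under these assumptions, a request with 100,000 input tokens and 10,000 output tokens comes to about 0.13 US dollars, while one with 500,000 input tokens and 20,000 output tokens comes to about 1.12 US dollars.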

MiMo Claw, launched at the same time on the official model experience page, opens up free "lobster raising" on MiMo-V2-Pro. The MiMo Claw module is also now fully integrated with the Kingsoft WebOffice ecosystem, with native support for the four mainstream formats of Word, Excel, PPT, and PDF, which cover over 95% of everyday document types. Xiaomi Browser has also connected to MiMo-V2-Pro to support AI search.

02. MiMo-V2-Omni: Specializing in multi-modality interaction and execution

Can research buying guides and bargain for you

The full-modality base model Xiaomi MiMo-V2-Omni is designed for complex real-world scenarios involving multi-modality interaction and execution, and it integrates text, vision, and voice.

Accurate perception and precise reasoning are the cornerstones of efficient execution. In audio understanding, MiMo-V2-Omni supports environmental sound classification, multi-speaker separation, audio-visual joint reasoning, and in-depth understanding of continuous audio more than 10 hours long, with performance surpassing Gemini 3 Pro. In image understanding, it offers multi-disciplinary visual reasoning and complex chart analysis, outperforming Claude Opus 4.6 and approaching Gemini 3 Pro. In video understanding, it supports native joint audio-video input and outperforms Gemini 3 Flash.

As an agent, MiMo-V2-Omni can understand complex environments across modalities, formulate and execute plans on its own, and adjust its strategy in real time when it hits exceptions, ultimately delivering complete results end to end.

On benchmarks that evaluate interaction with real digital environments, MiMo-V2-Omni approaches the performance of Gemini 3 Pro. On text-only agent tasks, its average performance is second only to Claude Opus 4.6.

Combined with the OpenClaw framework, MiMo-V2-Omni can operate a browser the way a human does.

Prompt: Help me figure out how to choose a Xiaomi 17. Do some research on Xiaohongshu, select the right one, and then place an order on JD.com. Try to bargain as well.

The model opens Xiaohongshu on its own, browses posts, extracts configuration comparisons, photo reviews, and real-user experiences, and then compiles purchase suggestions. It then opens JD.com to compare prices across stores, asks to be transferred to a human customer service agent to bargain, and, if the price is right, adds the item to the cart and places the order.

MiMo-V2-Omni is connected to WPS Office. From just a few words of instruction, it can directly generate Word documents, structured Excel spreadsheets, well-formatted PDF files, and complete PPT presentations.
