When large AI models are repriced: Unisound launches U2, ushering in its "DeepSeek moment"

晓曦, 2026-06-08 19:23
In the era of native agents, large models represent a pure and efficient productive force.

The large-model industry has long operated under a consensus that is taken almost for granted.

Only a large parameter count makes a model powerful; only a sufficiently long context makes its capabilities comprehensive; only a sufficiently complex reasoning chain demonstrates its intelligence.

So over the past few years, large-model companies have kept pushing the technical frontier: from hundreds of billions of parameters to trillions, from contexts of hundreds of thousands of tokens to millions, and from single responses to ever longer reasoning. The capital market has been willing to pay for these bigger visions. Model rankings have changed frequently, training costs have kept climbing, and GPUs have become one of the most expensive means of production.

However, as the enthusiasm for blindly piling up parameters fades, the industry faces an unavoidable and embarrassing reality: dense models with hundreds of billions or even trillions of parameters have driven both training and inference costs to astronomical levels and raised the deployment threshold extremely high.

Whether for high-flying enterprises or individual geeks, there is a huge gap between the ideal of "emergent intelligence" and the reality of "unaffordable and hard to use".

The crazier the first half was, the harsher the second half will be.

At this moment, some far-sighted players have realized that a paradigm shift is under way: generative AI is evolving into productive AI.

The task now is to deliver sufficiently strong intelligence into the capillaries of real industry at a lower overall cost and through more stable delivery.

At this juncture, which will shape the direction of the industry, an AI veteran has offered its answer: rebuilding efficiency through a thorough upgrade of its base model.

Today, Unisound officially released U2, its new-generation general-purpose large language model base.

This is not only the most important technical iteration of the base model since Unisound's listing, but also a key milestone in its transformation into a "native agent large-model company".

While the industry was still chasing the chat skills of generative AI, Unisound put forward the concept of "productive AI" early, and its meaning is clear: the ultimate value of AI is not to generate content but to solve complex tasks in the real world.

This concept has made Unisound a pioneer. While peers were still talking about "emergence", Unisound had already started thinking about "getting things done"; while others were competing on parameter scale, it had already identified intelligent density and token value as the commercial essence. This forward-looking stance has allowed Unisound to gain the right to define the rules before the main competition among large models begins.

This company, which has worked in AI for more than a decade, did not join the parameter arms race. Instead, it is using the new underlying logic represented by U2 to announce a repricing of the commercial value of large models. With more than a decade of know-how in vertical scenarios, Unisound has built a moat that is very hard to replicate, and that is the basis of its confidence in staying at the core of the first echelon of domestic large models.

It's about sound, but more than just sound

To understand the second half of the large-model era, one must first see clearly who is at the table.

Among China's fiercely competing large-model players, Unisound, founded in 2012, is something of an outlier. It entered the field through speech recognition and was active in scenarios such as smart healthcare, smart home, and in-vehicle cockpits. Over more than a decade, it has lived through the full technical cycle from statistical learning to deep learning to the large-model era. It is therefore often regarded as a somewhat "old-fashioned" AI player.

For a long time, because of the word "sound" in its name, outsiders habitually labeled it a speech-recognition company. At the peak of the large-model craze, attention went to the new Internet upstarts and the "Six Little Tigers", which had raised billions of dollars and were in the spotlight. Unisound, in its quiet period after listing on the Hong Kong Stock Exchange, seemed rather low-key.

Fortunately, the era of chatty generative AI ended in 2025, and everyone realized that what matters is productive AI that can get things done. The industry then discovered that Unisound's previously underestimated core business had become its widest moat in the era of agents.

"Behind sound is language, and behind language is intention. What we listen to is not just sound, but the consciousness behind the sound." This is how Huang Wei, the founder of Unisound, explains the "sound" in the company's name.

In his view, human-machine interaction has three levels. The first is "understanding speech", that is, speech recognition, which converts sound into text. The second is "understanding intention": when a user says "I'm cold", they do not just want a reply; they want the air conditioner to adjust the temperature and the curtains to close automatically. The third is understanding deeper consciousness and context: when an elderly person living alone casually says "I have nothing to do today", can the AI recognize loneliness from the tone and pauses and proactively offer companionship or reminders?

From speech recognition and natural language understanding to today's large models and agents, Unisound has always been doing one thing: making machines truly understand humans and helping humans get things done.

In the physical world of human-machine interaction, making machines truly serve humans means solving a series of engineering problems, including multi-round interaction, long-chain tasks, complex environmental noise, and human-machine collaboration.

This scenario know-how and multi-round interaction engineering experience, earned in demanding and complex vertical scenarios, is the natural soil in which native agents grow.

Based on this insight, Unisound positions its newly released general-purpose base model U2 internally as a native agent large model (Agent-Native Model). Its size, training goals, and optimization directions are all designed around "executing tasks".

On the technical path, Unisound did not follow the common industry route of "finishing model training and then attaching an agent framework externally". Instead, it proposed a more radical approach:

First, a co-evolution mechanism between the native agent model and the Harness. In the past, most agent systems were essentially a shell around a general chat model: the model only talked, while planning, tool invocation, and task execution were all handed to an external framework, so the model itself did not really "understand" these things. U2 instead internalizes the full ability to plan, execute, and check results against acceptance criteria into the model layer during training. Throughout training, the model and the Harness (the task execution framework) evolved together: as the model's structure grew more complex, the framework's support points and verification accuracy were extended and refined, and a more precise and strict framework in turn kept each layer of the model's logic solid, forming a self-reinforcing cycle.

Second, the systematic application of process supervision and curriculum learning. To make the agent as efficient as a person who gets things done cleanly, U2 uses curriculum learning during training, letting the model progress step by step from easy to difficult, from short to long context, and from simple to complex tool invocation. For long-horizon task trajectories, U2 uses process supervision, in which a stronger model breaks down, evaluates, and corrects each key node of task execution. Training therefore looks not only at the final result but also optimizes each intermediate execution path, which the company says leads to faster convergence.

Third, a data mix weighted toward the real economy and hardcore industries. While many large models still rely heavily on general Internet corpora for generalization training, Unisound deliberately reduced the share of corpora from low-value scenarios such as entertainment and shifted more data resources toward high-value industry scenarios such as healthcare, medical insurance, insurance, government affairs, and industry. It also synthesized and trained on desensitized real-scenario data accumulated over years of business deployment, data that is hard for others to replicate.
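The curriculum learning described in the second point above can be sketched roughly as follows. This is a minimal illustration, not Unisound's training code: the `Task` fields, the `difficulty` scoring rule, and the sample tasks are all hypothetical, chosen only to show ordering "from easy to difficult, from short to long context, and from simple to complex tool invocation".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    context_tokens: int  # length of the context the task needs
    tool_calls: int      # number of tool invocations the task needs


def difficulty(task: Task) -> float:
    # Hypothetical score: longer context and more tool calls count as harder.
    return task.context_tokens / 1000 + 2 * task.tool_calls


def curriculum_stages(tasks: List[Task], num_stages: int) -> List[List[Task]]:
    """Sort tasks by difficulty and split them into consecutive training stages."""
    if not tasks:
        return []
    ordered = sorted(tasks, key=difficulty)
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


# Hypothetical sample tasks, deliberately given out of order.
tasks = [
    Task("multi_doc_audit", context_tokens=32000, tool_calls=6),
    Task("lookup", context_tokens=500, tool_calls=0),
    Task("claim_review", context_tokens=8000, tool_calls=3),
    Task("form_fill", context_tokens=2000, tool_calls=1),
]
stages = curriculum_stages(tasks, num_stages=2)
stage_names = [[t.name for t in stage] for stage in stages]
```

In a real pipeline the model would be trained on each stage in turn; how the stages are scored and scheduled for U2 is not disclosed in the article.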

With its underlying capabilities rebuilt, Unisound's U2 is competitive on performance without blindly piling up parameters. On instruction-following evaluations such as IFBench, U2 ranks among the top in the industry; on Claw-related evaluations, its agent and tool invocation capabilities show strong advantages; on hardcore knowledge reasoning and long-context tasks such as GPQA, U2 shows it can challenge the world's top large models; and on GDPval, which evaluates delivery ability for real-world office and knowledge work, U2 scored 72.5 points, showing solid professional office capabilities.

Most importantly, U2 breaks the spell that "first-class performance must be tied to super-large parameters". It rejects parameter bloat and, through a heavily optimized MoE (Mixture of Experts) architecture and algorithmic improvements, aims to compress world-class capabilities into a smaller parameter scale: small, strong, and cost-effective.

This low-key and restrained AI veteran has joined the first echelon of domestic large models in a leading position.

How to close the business logic loop?

As a technology company with more than a decade of industry experience, Unisound knows better than the newcomers of recent years that, while generational leaps in technology are worth striving for, business logic cannot be ignored.

In the past, the large-model industry tended to discuss tokens only from the perspective of hardware and computing power. While the entire industry was competing over who could generate more tokens and who had higher computing efficiency, Huang Wei, the founder of Unisound, did a sharper piece of business arithmetic: "If the same one million tokens are generated but they are all for chatting and nonsense, then high computing efficiency has no commercial value at all."

Based on this understanding, Unisound put forward, for the first time in the industry, a disruptive business formula:

AI commercial value = intelligent density × token value.

Breaking it down: intelligent density means reaching a sufficiently high level of intelligence with fewer parameters and lower overall resource input. Token value means that each invocation of the model must convert directly into measurable business results, either reducing risk or increasing productivity.

The newly released U2 model is the concrete embodiment of this thinking. To make sure every penny the customer spends counts, U2 has been optimized rigorously at the level of its underlying technology.

The co-evolution mechanism between the native agent model and the Harness mentioned earlier is aimed at exactly this problem. Through the co-evolution of the model and the toolchain, U2 can complete task planning, tool invocation, execution, and acceptance in fewer interaction rounds, which reduces the tokens wasted on trial and error and further improves the task completion rate.

At the same time, U2 is built on a sparse Mixture of Experts (MoE) architecture. Whereas a traditional dense model activates all of its parameters, an MoE model activates only the experts most relevant to the task at hand. According to information disclosed by Unisound, U2 activates only about one-tenth of its parameters each time it processes a task, while the remaining parameters "sleep on demand". The actual computation performed at run time is therefore much smaller than the model's full scale would suggest, which significantly reduces the computing cost of inference while maintaining high performance.
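A minimal sketch of the sparse routing idea described above, assuming a softmax gate with top-k expert selection, which is one common MoE design. The article does not disclose U2's expert count, router design, or the number of experts activated per token; the values 20 and 2 below are hypothetical and were chosen only so that the active share comes out to one-tenth.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores: List[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def active_expert_fraction(num_experts: int, k: int) -> float:
    # Share of expert parameters used per token, assuming equally sized experts.
    return k / num_experts


# Hypothetical gate scores for 20 experts; only 2 experts run for this token.
gate_scores = [0.1 * i for i in range(20)]
routes = route_top_k(gate_scores, k=2)
fraction = active_expert_fraction(num_experts=20, k=2)
```

Only the experts listed in `routes` would run for the token; the others are skipped entirely. Real MoE models also contain components that are always active (attention layers, embeddings), so the share of total parameters activated is not simply `k / num_experts`.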

More distinctive still is U2's redesigned thinking process. Some large models unfold a long reasoning trace on complex problems, writing out each intermediate step. This improves interpretability, but it creates another problem: users pay for a large number of tokens that produce no final value. U2 first explores efficiently in the latent space, which avoids decoding every intermediate thinking step into visible tokens. When the task reaches a critical stage, the model switches to explicit reasoning and completes logical calibration, process verification, and final decision-making through a readable and verifiable reasoning process. Unisound calls this "implicit thinking reasoning + explicit thinking verification".

"If these one million tokens are all for chatting nonsense, high efficiency has little commercial value," Huang Wei once said.

This strategy of "pursuing high-value tokens with high intelligent density" soon drew real-money feedback in the market, and the results show the potential of the new formula.

According to the latest data, driven by a sharp rise in demand for high-quality scenario tokens, the ARR of Unisound's token invocation revenue rose 600% month-on-month in May. Based on current order momentum, strong growth is expected to continue in June, with ARR projected to reach US$15 million.
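The arithmetic behind these figures can be sketched as follows, assuming the common definition of ARR as monthly recurring revenue multiplied by 12; the article does not state how ARR is computed. Note that "increased by 600%" read strictly means the new value is seven times the old one, while a looser "six-fold" reading means six times. The baseline of 1.0 is a hypothetical unit, not a disclosed number.

```python
def apply_percent_increase(base: float, percent: float) -> float:
    """Value after increasing base by the given percentage (strict reading)."""
    return base * (1 + percent / 100)


def monthly_run_rate(arr: float) -> float:
    """Monthly revenue implied by an ARR figure, assuming ARR = monthly revenue x 12."""
    return arr / 12


# Strict reading: a 600% increase turns 1 unit of ARR into 7 units.
strict_multiple = apply_percent_increase(1.0, 600)

# Loose "six-fold" reading: 1 unit becomes 6 units (equivalent to a 500% increase).
loose_multiple = apply_percent_increase(1.0, 500)

# A US$15 million ARR implies this monthly revenue under the assumed definition.
implied_monthly_usd = monthly_run_rate(15_000_000)
```

Under the assumed definition, the projected US$15 million ARR corresponds to US$1.25 million of token revenue per month.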

In the second half of the large-model era, the ceiling on Unisound's business scale has been lifted.

Behind this is a fundamental leap in Unisound's business model.

Traditional ToB companies have long been stuck in the quagmire of project-based work: long delivery cycles, heavy customization, and hard-earned money from one-off deals. With the release of U2 and its continuous output of high-value tokens, Unisound's revenue model is now tied directly to how intensively customers use AI. As long as customers keep invoking AI in real business processes, revenue flows like an open faucet, generating high-frequency, high-margin repeat purchases.

This efficient business loop is now being accelerated by Unisound's two-wheel-drive layout:

On the ToB side (Shouya Agent Platform), Unisound uses U2 as the core base to push into vertical industries. With an extremely high task completion rate, the company has recently won a series of bids in industry scenarios with strict accuracy requirements, such as healthcare, medical insurance, transportation, customer service, and employee badges. These high-value industries not only contribute high revenue per customer but also feed the real, high-quality business data accumulated in these scenarios back into the model base, forming a virtuous cycle in which the model becomes smarter and its intelligent density rises with use.

On the ToC and developer side (public-cloud MaaS), Unisound has expanded around the OPC ecosystem. Through lower-threshold and more cost-effective model API access, it steadily captures high-frequency token traffic and revenue from a wide range of independent developers and the consumer application ecosystem.

Instead of blindly chasing the limits of the Scaling Law, Unisound found an ecological niche that matches its own resources and endowments. With self-sustaining, technology-driven revenue growth, shown by a six-fold increase in monthly ARR, Unisound has demonstrated that in the second half of the large-model era, only players who can do the efficiency math clearly and hold the business loop firmly have a truly open-ended future.

The main competition has just begun

Looking back on the year since its listing on the Hong Kong Stock Exchange, Unisound has turned in a solid report card: it has not blindly followed trends, has stayed grounded in reality, has rebuilt efficiency with technology, and has proven itself with business results. While the entire industry was trapped in parameter inflation and paying for expensive computing experiments, this AI veteran achieved technological foresight and found commercial value through a clear strategic orientation and more than a decade