Annual review of AI models

IT Times (IT时报), 2026-01-05 18:30
Witness the transformation from a tool to a partner

When the first ray of morning light filters through the curtains, an "AI partner" that understands your schedule and looks after your well-being starts its day's work.

Over the aroma of your morning coffee, it organizes your schedule and even drafts a brief outline for the morning meeting. On your commute, it converts the project document you didn't finish reading last night into audio and automatically flags the key data. At your desk, faced with a complex data report, it generates a chart in seconds. Before your afternoon client meeting, you enter your core ideas and it quickly produces a draft slide deck with a layout and illustrations that fit the brand style. On your way home, you tell it by voice what you want for dinner, and it suggests recipes that match the food you have on hand and even connects to smart home devices to preheat the cookware. At night, as you get ready for bed, it has already selected tomorrow's most important news and says "Good night"...

If 2024 was the year of tentatively exploring AI, then in 2025 we truly live and work side by side with it, as AI becomes infrastructure as natural as water and electricity. AI applications are undergoing a transformation of "full-scenario penetration and full-process empowerment". Our perception of time, our decision-making patterns, and even the rhythm of our lives are being quietly reshaped. What AI changes is far more than work efficiency.

This list is not just about ranking strengths and weaknesses. Imperfect as it may be, it aims to record how AI is reshaping established paradigms at astonishing speed. In examining this list, we are not only looking for the most reliable "partner" among many choices but also asking: as AI's capabilities reach into core areas, how should we harness them? How can AI better contribute to this "intelligent transformation" within a regulatory framework?

Side A

Doubao: The "National-Level AI Application" That Has Gone Mainstream

Which AI partner will handle the interaction between audience and stage at the Spring Festival Gala for the Year of the Horse? This time, it's Doubao, which will join Volcengine on the stage of the 2026 CCTV Spring Festival Gala. Doubao has been in the news a lot recently: its DAU (daily active users) has exceeded 100 million, average daily token calls to the Doubao large model have exceeded 50 trillion, and more than 100 enterprises have cumulative usage of over 1 trillion tokens... It has become the AI assistant with the largest user base in the Chinese market and a "national-level AI application".

These remarkable figures stem from its rapid technological "evolution". It took about half a year for the Doubao large model to evolve from version 1.5 to 1.8. For example, on test sets covering complex reasoning, competition-level mathematics, multi-turn conversation, and instruction following, Doubao 1.6-thinking ranks among the best globally, achieving top results on 38 of 60 public evaluation benchmarks. With Doubao 1.8, its tool-calling, complex instruction-following, and OS Agent capabilities have all been strengthened, unlocking all-around skills of "seeing, writing, doing, and planning". In visual understanding, for instance, the number of frames Doubao 1.8 can process for a single video has doubled from 640 to 1280. It supports understanding ultra-long videos at a low frame rate and can call tools to examine key segments at a high frame rate. This ability can be widely used in scenarios such as online education and product quality inspection. In multiple public evaluations, Doubao 1.8 has achieved the best or near-best results in visual reasoning, general visual Q&A, spatial understanding, and video understanding.

Meanwhile, its video-generation model has made a qualitative leap from "silent images" to "precise synchronization of audio and video". The newly launched "Draft Sample" function produces a preview that is highly consistent with the final product, which can help creators improve their efficiency by 65%.

Reasons for Making the List

Doubao presents a distinctive example. Rather than defining itself as "first" on any single technical parameter, it turns the "large model" into an "actor". Moving past dazzling demos, from "dialogue" to "action", is a more fundamental paradigm shift for AI. When AI no longer just generates text and images but actively calls tools, controls interfaces, and connects complex cross-platform processes, it is in effect intervening in how the real world operates. This will undoubtedly bring new frictions and considerations, but precisely because it penetrates so deeply, it may give rise to unprecedented forms of collaboration and productivity gains.

Tencent Yuanbao: From "Breaking Out" to "Fitting In"

In 2025, many people found a new "friend" in WeChat: Yuanbao. It defines, in a new way, what "intelligence" looks like within a social ecosystem. It is not a standalone application that has to be downloaded, registered for, or deliberately opened, but a "partner" always on standby inside a "national-level social platform".

Previously, what most impressed people about Tencent Yuanbao was that it was the first leading AI application in China to offer a "Hunyuan + DeepSeek" dual-model setup. Users can switch as needed: DeepSeek, which responds faster, for writing code or solving math problems, and Hunyuan, which is good at logical reasoning, for analyzing long documents or drawing up in-depth plans, getting the most out of each scenario.

Drawing on the natural advantages of the Tencent ecosystem, Yuanbao fits into everyday scenarios even more smoothly. When you receive a complex PDF, there's no need to leave WeChat: just forward it to Yuanbao and say "Help me summarize the core points and find the action items", and a clear summary comes back right away. When you're in a meeting with no time to take notes, send it a long voice message and you'll soon get a well-structured written summary. AI capabilities thus blend naturally into daily social and office life, turning what used to be a deliberate, "ritual-like" call into a habit as natural as sending a message.

Not long ago, Tencent Yuanbao launched a new "task reminder" function, seen as a sign of its evolution from a "dialogue assistant" into a "personal task agent". It can understand natural language, break down complex tasks, and drive them forward. For example, if you say "Remind me to work out every Monday, Wednesday, and Friday. If it rains, remind me to do yoga at home", Yuanbao can understand the request, taking the practicality of the intelligent assistant to a new level.
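
The rain-dependent reminder in that example can be read as a small rule: a recurring schedule plus a condition that swaps the activity. The following is a minimal Python sketch of that logic only. It is not Yuanbao's implementation, and the names `WORKOUT_DAYS` and `reminder_for`, as well as the `is_raining` flag standing in for a weather lookup, are hypothetical.

```python
from datetime import date
from typing import Optional

# Monday=0 ... Sunday=6, matching date.weekday().
WORKOUT_DAYS = {0, 2, 4}  # Monday, Wednesday, Friday (hypothetical constant)


def reminder_for(day: date, is_raining: bool) -> Optional[str]:
    """Return the reminder text for a given day, or None if no reminder is due.

    `is_raining` stands in for a weather lookup that a real assistant
    would have to perform; it is an assumption of this sketch.
    """
    if day.weekday() not in WORKOUT_DAYS:
        return None
    if is_raining:
        return "Do yoga at home"
    return "Time to work out"
```

What such an agent has to get right is the parsing step that turns the spoken sentence into a rule like this one; the sketch starts from the point where that step has already succeeded.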

Reasons for Making the List

The shift from making a splash with the dual-model setup to bringing AI capabilities into everyone's social life in a more natural, more intimate way can be seen as a move from "breaking out" to "fitting in". When AI is embedded seamlessly in the most frequent social and office scenarios, that integration reshapes user behavior and expectations as a whole. As the technology matures, its ultimate value will depend more on how well it understands and serves the complex ways people already live. Future leaders may be the service designers who are good at making technology invisible.

Tongyi Qianwen: The "AI Scholar" That Can Digest 100 Documents

Which AI is good at long-text processing? Tongyi Qianwen may be one of them.

In 2025, Tongyi Qianwen released the Qwen2.5 and Qwen3 series of models, with significantly improved performance. In pre-training, the Qwen3 dataset was expanded compared with Qwen2.5. According to the Tongyi official website, Qwen2.5 was pre-trained on 18 trillion tokens, while Qwen3 uses almost twice as much data, about 36 trillion tokens, covering 119 languages and dialects.

Tongyi Qianwen has many strengths in Chinese understanding and logical reasoning. On the one hand, it offers free document parsing for web pages, documents, papers, books, and more. Beyond online web pages, it can handle a single document of over ten thousand pages, equivalent to about 10 million Chinese characters, and can read 100 files in different formats in one click. On the other hand, it builds on an enhanced Transformer architecture. To address problems such as inaccurate parsing of ancient texts and rare characters and inconsistent translation of technical terms, Tongyi uses Rotary Position Embedding (RoPE) to capture the sequential logic of classical Chinese, and its translation of technical documents keeps terminology consistent with an accuracy rate of over 96%.
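
For readers unfamiliar with RoPE, the idea is that each pair of vector components is rotated by an angle proportional to the token's position, so the dot product between two rotated vectors depends only on how far apart the tokens are. The sketch below, in plain Python, illustrates the general technique. It is not Tongyi's code; the function names `rope` and `dot` are illustrative, and the base of 10000 is the commonly used default rather than a value taken from this article.

```python
import math
from typing import List


def rope(x: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Apply rotary position embedding to one vector of even length.

    Components are treated as consecutive pairs (x[0], x[1]), (x[2], x[3]), ...
    Pair i is rotated by the angle position * base ** (-2 * i / d).
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        a, b = x[2 * i], x[2 * i + 1]
        out.append(a * math.cos(theta) - b * math.sin(theta))
        out.append(a * math.sin(theta) + b * math.cos(theta))
    return out


def dot(u: List[float], v: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))
```

Rotating a query at position 5 and a key at position 3 gives the same dot product (up to floating-point error) as rotating them at positions 12 and 10, because only the offset of 2 matters.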

Beyond its large model for text generation, Tongyi's large model for image generation has 20 billion parameters. Its large model for video generation can generate a video from a single sentence and can also produce a smooth video from a supplied first frame, or from first and last frames.

Reasons for Making the List

In work and study, processing long documents is a necessity for many people. Tongyi Qianwen's distinctive value is that it makes massive, complex information easy to handle. Whether it's a ten-thousand-page body of literature or a hundred mixed documents, it can quickly sort out the thread and extract the essentials, extending how deep personal research and learning can go. Its multimodal creative abilities, from text to image and video, bring its core capabilities to users seeking both efficiency and depth.

WPS AI: Working While "Chatting"

In 2023, Kingsoft Office launched WPS AI and introduced a series of AI functions around AIGC (Content Creation), Copilot (Intelligent Assistant), and Insight (Knowledge Insight). In 2024, WPS AI 2.0 was born, focusing on specific enterprise scenarios and using AI to promote the intelligent application of enterprise knowledge. In late July 2025, WPS AI 3.0 with WPS Lingxi as the core was launched.

Data shows that as of the end of March 2025, the global monthly active devices of WPS Office reached 647 million.

In the new version, upgraded intelligent creation is one of the core highlights. In some WPS Office components, the screen is split for side-by-side interaction, with the Office suite on the left and WPS Lingxi on the right. Users can state their requirements in natural language in the dialogue box on the right; once the AI identifies the intent, it modifies the document on the left without any switching between applications. Compared with other products, Lingxi offers multi-turn dialogue, controllable modification, and format preservation, letting users steer the AI toward truly usable results.

On the one hand, the threshold is low: users can get started quickly and create documents through dialogue. On the other hand, for data processing, the WPS Knowledge Base can turn users' cloud documents into a knowledge base. Anyone can search for answers and filter data in the WPS Knowledge Base and write plans or documents based on private-domain knowledge.

In addition, WPS Office has introduced a new PPT creation mode. Users can revise the PPT outline while chatting with the AI and can then fine-tune the template, individual pages, and layout, easily "chatting" their way to the desired result.

Reasons for Making the List

Intelligent office is nothing new. WPS AI doesn't create a new product that requires deliberate learning. Instead, it makes AI a "Lingxi" assistant always on standby in the Office suite. All operations are completed on the same screen, and the results can be used immediately. This "chatting is creating" experience greatly lowers the threshold for intelligent office work. Through the knowledge-base function, it also brings the mass of documents each user stores in the cloud "to life", turning them into private-domain knowledge assets that can be called on at any time to support decisions, and making it a powerful tool for efficient work at every desk.

As AI rapidly integrates into human work and life, our scrutiny also turns to the other side of the coin. Behind AI's rapid development, there are inevitably shortcomings and challenges that need to be addressed. These issues may be gaps in user experience as the technology is deployed or compliance boundaries tested by innovation, but they are also a path the industry must travel to mature.

Side B

Manus: Can It Retain Users After Being Acquired by Meta?

In 2025, the trajectory of the AI agent Manus swung dramatically from frenzy to cooldown. At the beginning of the year, Manus quickly gained popularity with the concept of a "general AI agent". A demonstration video of it independently completing tasks such as resume screening and stock analysis drew attention across the internet. Invitation codes for the closed beta sold for as much as 100,000 yuan, and its valuation at one point soared to 500 million US dollars.

After the hype faded, the product's core weaknesses gradually emerged. According to reports, Manus's technical approach focuses on model integration and post-training, and it has no self-developed model of its own. The low technical barrier was confirmed when multiple teams replicated Manus and open-sourced their versions within a short period.

In addition, some media reported that users who tried the product complained about its slow running speed, heavy token consumption, and mediocre performance. Public information shows that running a single task on Manus costs about 2 US dollars (about 14 yuan), which makes it hard to meet the low-cost requirements of real-world scenarios.

These shortcomings directly affected users' willingness to stay. In March 2025, Manus' visits reached 23.76 million, but dropped to 16.16 million in May.
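
As a quick check on the size of that drop, the relative decline can be computed from the two visit figures quoted above. This is plain arithmetic on the article's own numbers; the function name `percent_decline` is just for illustration.

```python
def percent_decline(before: float, after: float) -> float:
    """Relative decline from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Monthly visits in millions, as quoted in the article.
march_visits = 23.76
may_visits = 16.16

decline = percent_decline(march_visits, may_visits)
print(f"{decline:.1f}%")  # prints 32.0%
```

In other words, visits fell by roughly a third in two months.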

How could the decline be reversed? Manus shipped several upgrades in the second half of 2025. In October, Manus 1.5 was launched, addressing pain points such as speed and reliability. In December, a text-to-image function was added and integrated into the agent workflow. In mid-December, Manus announced that its ARR (annual recurring revenue) had exceeded 100 million US dollars, with total consumption of 14 trillion tokens.

The latest news is that Manus has been officially acquired by Meta, in the third-largest acquisition in Meta's history. This may be the best outcome for Manus.