Tencent Hunyuan Updates: Emphasizing Both Multimodality and Intelligent Agents | The Frontline
Author | Deng Yongyi
Editor | Su Jianxun
Tencent's strategic implementation of large models is accelerating.
"As AI continues to be implemented, every enterprise is becoming an AI company, and every individual will become a 'super individual' empowered by AI." On May 21st, at the Tencent Cloud AI Industry Application Summit, Tang Daosheng, Senior Executive Vice President of Tencent Group and CEO of the Cloud and Smart Industry Group, said.
On May 21st, Tencent's Hunyuan models received a comprehensive upgrade: both the flagship fast-thinking model Hunyuan TurboS and the deep-thinking model Hunyuan T1 shipped new versions.
Building on the TurboS base, Tencent also launched the visual deep-reasoning model T1-Vision and the end-to-end voice call model Hunyuan Voice. At the summit, a series of multimodal models, including Hunyuan Image 2.0, Hunyuan 3D v2.5, and Hunyuan Game Visual Generation, were updated as well.
Tang Daosheng. Source: Tencent
Tang Daosheng said that on Chatbot Arena, a widely recognized large-language-model evaluation platform, Hunyuan TurboS has climbed into the global top eight, second in China only to DeepSeek. In science-oriented abilities such as coding and mathematics, Hunyuan TurboS has also entered the global top ten.
Hunyuan TurboS was officially released in early 2025 as a large-scale hybrid Mamba-MoE model, showing clear advantages in both quality and efficiency. The latest gains come from training on more tokens during pre-training and introducing long-short thinking-chain fusion during post-training, which improved TurboS's science reasoning by over 10%, its coding ability by 24%, and its competition-mathematics performance by 39%.
Source: Tencent
Tencent began investing heavily in deep-thinking model research as early as the second half of last year. Since the Hunyuan T1 deep-thinking model launched on the Yuanbao App at the start of this year, it has iterated rapidly. The latest upgrade improves several core capabilities: competition mathematics by 8%, general-knowledge Q&A by 8%, and Agent ability on complex tasks by 13%.
China's large-model market currently features a wide variety of models, each with its own technological strengths. Hunyuan's multimodal capabilities, for example, such as 3D and video generation, have earned a good reputation among developers.
The newly released visual deep-reasoning model T1-Vision supports multi-image input and a native long thinking chain. In the product, it can "think while looking at pictures", with overall performance up 5.3% from the previous version and understanding speed up 50%.
Hunyuan Voice, an end-to-end voice call model, supports low-latency voice conversations: compared with a cascaded pipeline, response latency improves by more than 30%, dropping to 1.6 seconds, and the model's human-like, emotionally expressive delivery has also improved markedly. It is currently in a gray-scale (staged) rollout on the Tencent Yuanbao App, with a real-time AI video call experience to follow soon.
Interestingly, when introducing the image-generation capability of Hunyuan Image 2.0, Tencent cited one notable result: in human evaluations of subjective image quality and aesthetics, Hunyuan Image 2.0 is considered one of the models with the least "AI flavor".
To some extent, this suggests that now that foundation models have proliferated, factors such as the diversity and aesthetics of model outputs are being folded into the evaluation criteria.
The Knowledge Engine Is Fully Upgraded to the "Intelligent Agent Open Platform"
Another highlight of this summit is Tencent's intelligent agent strategy.
2025 has been called the first year of the AI agent. With the explosion of reasoning models and multimodal models, intelligent agents have become the most closely watched direction in the large-model field this year.
A key move at this summit was Tencent's upgrade of its original large-model Knowledge Engine into the "Tencent Cloud Intelligent Agent Development Platform".
The upgraded platform reportedly integrates Tencent Cloud's RAG (Retrieval-Augmented Generation) technology with full Agent (intelligent agent) capabilities, helping enterprises quickly activate private-domain knowledge and build their own intelligent agents.
Why upgrade to a brand-new intelligent agent platform at this time?
Wu Yunsheng, Vice President of Tencent Cloud, Head of Tencent Cloud Intelligence, and Head of the YouTu Lab, said in a post-summit interview that the platform was upgraded to help enterprises truly afford and make good use of intelligent agents, rather than leaving them at the conceptual stage.
Another important reason is that technological progress has made it practical to deploy intelligent agents quickly. "In the past, when we used traditional AI technology to deliver these capabilities, the results were not ideal. Tasks such as keyword extraction and summary generation demand strong language understanding," Wu Yunsheng said.
With the emergence of large models, and multimodal large models in particular, semantic understanding, context modeling, content segmentation, and label generation have all improved significantly. The most direct impact is that large models have sharply raised the accuracy of semantic retrieval and matching, while multimodal models have made combined visual-and-text tasks possible.
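To make the RAG pattern behind the platform concrete, here is a minimal retrieve-then-generate sketch over private-domain documents. It is a generic illustration only, assuming a vector-similarity retriever: the embed stub, retrieve helper, and call_llm placeholder are hypothetical and are not Tencent Cloud's Intelligent Agent Development Platform API.

```python
# Minimal RAG (Retrieval-Augmented Generation) sketch, for illustration only.
# The embedding stub, retrieval logic, and call_llm helper are hypothetical
# placeholders, NOT Tencent Cloud's Intelligent Agent Development Platform API.
from dataclasses import dataclass

import numpy as np


@dataclass
class Document:
    text: str
    embedding: np.ndarray  # vector produced by some embedding model


def embed(text: str) -> np.ndarray:
    """Stand-in embedding: a real system would call an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.normal(size=768)
    return vec / np.linalg.norm(vec)


def retrieve(query: str, corpus: list[Document], k: int = 3) -> list[str]:
    """Rank private-domain documents by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: float(q @ d.embedding), reverse=True)
    return [d.text for d in ranked[:k]]


def call_llm(prompt: str) -> str:
    """Stub: in practice this would call a hosted LLM endpoint."""
    return "[model answer grounded in the retrieved context]"


def answer(query: str, corpus: list[Document]) -> str:
    """Retrieve supporting passages, then ask the model to answer from them."""
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)
```

The retrieval step is exactly where the improved semantic-matching accuracy described above pays off: better embeddings surface more relevant private-domain passages before the model generates its answer.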
"If an Agent has the ability to use a browser, its 'behavior boundary' will be greatly expanded and can cover many real - world scenarios," Wu Yunsheng said in the interview.
Open source was also one of the focuses of the event.
The Hunyuan 3D model has now surpassed 1.6 million downloads on Hugging Face. Going forward, Hunyuan plans to release hybrid reasoning models in multiple sizes, including dense models from 0.5B to 32B parameters and an MoE model with 13B activated parameters, to meet the varied needs of enterprises and on-device deployments.
In addition, multimodal foundation models such as Hunyuan Image, Video, and 3D, along with their supporting plugin models, will continue to be open-sourced.
Hunyuan is now deeply integrated across Tencent's business lines and widely used in core products such as WeChat, QQ, Tencent Yuanbao, Tencent Meeting, and Tencent Docs, raising the intelligence of Tencent's own products. It also makes its model capabilities available externally through Tencent Cloud, helping enterprises and developers innovate and work more efficiently.