
In three days, I clearly saw how AI will intervene in our lives in the future.

Yilan Business · 2025-08-01 07:18
Understand What Was Released at WAIC in One Article.

On July 28, the three-day World Artificial Intelligence Conference 2025 (hereinafter referred to as WAIC) came to an end.

Overall, calling it a "world-class" event still falls a bit short. For one thing, among overseas companies, apart from Google, only Tesla set up a booth, bringing its cars and the Optimus robot. For another, the World Robot Conference (WRC) was due to open in Beijing a week later, so many manufacturers had to choose one event and held their major new products back for the next stop.

However, WAIC's popularity has been increasing year by year. This year, more than 1,500 experts from over 70 countries and regions and more than 800 enterprises gathered at the Shanghai World Expo Exhibition and Convention Center. Twelve Turing Award and Nobel Prize laureates appeared on the same stage. The exhibition area exceeded 70,000 square meters for the first time, and the estimated number of visitors over the three days was 350,000. Tickets sold out two days before the opening, and scalpers hawked tickets on site, with prices soaring to over a thousand yuan each.

Beyond the hustle and bustle, this year's WAIC also sent out many signals.

Yilan Business summarized the four most obvious changes on site:

1. Generative AI has become even more "omnipresent", spreading from writing and drawing into hardcore fields such as industry, healthcare, and transportation.
2. As computing power advances, domestic chips and full-stack collaboration have become high-frequency terms, and the focus of competition has shifted from "single-chip performance" to "full-link efficiency".
3. Robots are no longer just "standing in poses". Their abilities in motion control, emotional interaction, and performing "tricks" have been significantly upgraded.
4. Robotaxis have made great strides. As manufacturers obtain more licenses, in-vehicle multimodal models are reshaping how people imagine the transportation system.

Omnipresent AI

The most immediate impression at this year's WAIC is that large models are no longer just about "comparing parameters" but are deeply integrated into specific scenarios. From in-vehicle cockpits to production workshops, from online customer service to coffee-making robots in the exhibition hall, they can be seen everywhere.

First, the large models themselves have made a leap in capability. Step 3, the new-generation foundation model released by Jieyue Xingchen, adopts an MoE architecture with 321 billion total parameters and 38 billion active parameters. It is billed as the first full-size, natively multimodal reasoning model: it can understand text, images, and mathematical symbols simultaneously, and it achieves a generational improvement in inference and decoding efficiency on domestic chips. In other words, a "smarter but more compute-efficient" model is moving toward practical use.

On the question of "how to use" large models, MiniMax offered another answer: turning the model directly into MiniMax Agent, a full-stack intelligent agent that can carry out tasks. It can break down tasks, call APIs, initiate payments, and schedule long-running workflows. In the on-site demonstration, a single sentence was enough for it to generate an enterprise data dashboard, or to build a simple e-commerce website and complete the payment loop, showing the prototype of an "AI colleague". That the product was iterated 12 times within a month also reflects the pace of competition in this field.

Hehe Information singled out the security issue. Its AI anti-forgery technology can identify deepfakes within milliseconds, which is particularly crucial for industries such as finance and government affairs. The booth attracted a large number of visitors with an interactive game of "spot the difference in famous paintings". Once scanned by the model, the tampered "Mona Lisa" and "Sunflowers" showed abnormal lighting and texture in the forged areas, and the model gave an authenticity score. The technology is also applicable to high-risk scenarios such as face swapping and bill tampering.

Baidu demonstrated a complete "application generation pipeline" at the exhibition. On one hand, GenFlow 2.0 can dispatch multiple intelligent agents to generate PPTs, charts, web pages, and scripts in one go; on the other hand, the Miaoda platform lets users generate a runnable application in three minutes from a one-sentence description of their needs. In the on-site demonstration of a "conference sign-in mini-program", everything from the interface to the logic was generated automatically. The upgraded digital human Nova also made its debut, with more natural movements and voice cloning closer to that of real human anchors, in preparation for live-streaming and short-video marketing scenarios.

Agora, which focuses on real-time interaction, released a new version of its conversational AI engine with three new capabilities: voiceprint recognition, visual understanding, and digital human interaction. The interactive plush pet "Fu Zai" became a popular exhibit on site. It can distinguish different people's voices in a noisy environment, respond precisely to instructions, and give anthropomorphic feedback by recognizing gestures and facial expressions through its camera, showing clear application potential in education, customer service, and entertainment.

Alibaba Cloud officially launched its first "super-brain" for AI agents, Wuying AgentBay. This cloud-based computer can run tasks such as code execution, web browsing, data analysis, and spreadsheet creation simultaneously. It offers multiple AI capabilities such as visual understanding, natural-language control, and task parsing, and can switch seamlessly between systems such as Windows, Linux, and Android. It can be integrated with just three lines of code, making it "move-in ready".

At this conference, Alibaba Cloud also presented its full-stack capabilities, from infrastructure to models to platform applications, including the cloud-native CPU Yitian 710, the Feitian cloud computing operating system, the HPN7.0 intelligent computing network architecture, the AI Stack all-in-one machine, the Tongyi Qianwen and Tongyi Wanxiang large model series, and the Bailian and PAI platforms.

Overall, the generative AI exhibition area at this year's conference sent out two obvious signals. First, multimodality and agentization have become a consensus, and large models are becoming tools that can "do work" rather than gimmicks for showing off parameters. Second, security and cost have moved up the agenda, and manufacturers are starting to confront forgery governance and inference efficiency.

GPUs: An All-out Push by Domestic Computing Power

If generative AI is the "brain", then computing power is the "muscle". At this year's WAIC, domestic GPU manufacturers practically lined up to showcase their products, in a collective "muscle-flexing" that ran from chip architecture to intelligent computing center solutions.

The Huawei exhibition area demonstrated the latest progress of Ascend AI. The core highlight is the Ascend cloud service based on the CloudMatrix 384 super-node. Its 384 Ascend NPUs and 192 Kunpeng CPUs are fully interconnected as peers through the new high-speed network MatrixLink, forming a super "AI server" with a computing power scale of 300 PFlops. This breaks through the bandwidth bottleneck of cross-machine communication and shifts resource supply from the server level to the matrix level.
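
To put the quoted scale in perspective, the figures above can be turned into rough per-device averages. The following is only a back-of-the-envelope sketch: it assumes the 300 PFlops total is spread evenly across the 384 NPUs, which the article does not state, and the function names are illustrative.

```python
def per_device_pflops(total_pflops: float, num_devices: int) -> float:
    """Average compute per accelerator, assuming an even split (an assumption)."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    return total_pflops / num_devices


def npus_per_cpu(num_npus: int, num_cpus: int) -> float:
    """Ratio of accelerators to host CPUs in the super-node."""
    if num_cpus <= 0:
        raise ValueError("num_cpus must be positive")
    return num_npus / num_cpus


# Figures quoted for the CloudMatrix 384 super-node.
print(f"{per_device_pflops(300.0, 384):.3f} PFlops per NPU on average")  # 0.781
print(f"{npus_per_cpu(384, 192):.0f} NPUs per Kunpeng CPU")              # 2
```

On these assumptions, each NPU contributes a little under 0.8 PFlops, and every Kunpeng CPU is paired with two NPUs.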

The Huawei Cloud CloudMatrix 384 super-node also has four major technical features. First, it delivers strong throughput through software-hardware co-optimization, with single-card decoding throughput reaching 2,300 tokens per second. Second, it covers a wide range of mainstream models, having accumulated more than 160 mainstream industry models, and can efficiently support model migration. Third, it is highly efficient at expert parallelism: it is the first in China to propose a large-scale expert-parallel solution, whose system-level optimization supports higher throughput and lower decode latency. Fourth, it scales flexibly with a small initial investment. New versions are released annually and usage is flexible and on demand, which better supports the rollout of "AI +".

Muxi Technology brought its new flagship product, the Xiyun C600 GPU. The chip uses an independently developed GPU IP architecture and a fully domestic, closed-loop supply chain that covers design, manufacturing, packaging, and testing. The Xiyun C600 supports large-capacity memory and multi-precision mixed computing, and is equipped with MetaXLink super-node expansion technology to meet the training and inference needs of next-generation generative AI. Notably, it has a built-in ECC/RAS protection module aimed at scenarios such as finance and government affairs, and it is positioned to compete head-on with international flagship GPU products.

The Moore Threads booth was more like a "full-stack computing power supermarket". Built on a full-function GPU, its product line runs from the cloud to the terminal: the flagship server MCCX D800 X2 for large-model training, based on a self-developed OAM module and a high-speed fully interconnected architecture, can support models with trillions of parameters; the newly launched intelligent computing acceleration card MTT S4000 handles both training and inference, with 48GB of video memory per card and 768GB/s of bandwidth, and supports mixed precision from FP8 to FP64; the cloud rendering card MTT S3000 and the desktop graphics card MTT S80 target the cloud-gaming and digital-twin markets and the personal consumer market respectively; and the AI computing module for edge scenarios in industrial and transportation settings enables cloud-edge collaborative deployment with 50 TOPS of computing power.
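
The memory and bandwidth figures quoted for the MTT S4000 imply a simple bound that is worth spelling out: how long one full pass over the card's memory takes. The sketch below is illustrative arithmetic only. It assumes the quoted 768GB/s is the peak rate and is fully sustained, which real workloads rarely achieve, and the function name is hypothetical.

```python
def full_sweep_seconds(memory_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to read the whole memory once at the quoted peak bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth_gb_per_s must be positive")
    return memory_gb / bandwidth_gb_per_s


# Figures quoted for the MTT S4000: 48GB of memory, 768GB/s of bandwidth.
sweep = full_sweep_seconds(48.0, 768.0)
print(f"{sweep * 1000:.1f} ms per full memory sweep")   # 62.5 ms
print(f"{1 / sweep:.0f} full sweeps per second at most")  # 16
```

Under that assumption, a workload that has to touch all 48GB on every step, for example a model whose weights fill the card, could complete at most about 16 such steps per second on a single card.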

The most remarkable aspect of Moore Threads is its integrated "cloud-edge-terminal" layout. The same MUSA architecture can run large-model training, graphics rendering, and scientific computing, which makes its products a natural fit for multimodal and embodied-intelligence workloads.

Suoyuan Technology, under the theme "The Flame of the Core Spreading Far and Wide", focused on practical commercial deployment. The booth mainly showed the large-scale commercial application of the Suoyuan® S60 artificial intelligence inference card, especially in scenarios such as chatbots, code generation, search recommendation, and advertising placement. Suoyuan also brought its DeepSeek all-in-one machine series, which features domestic CPU adaptation and multi-scenario optimization and lowers the threshold for enterprises to access AI inference services. The company also announced deployment cases in intelligent computing centers in Qingyang, Wuxi, Yichang, and other places, emphasizing a route of "commercialization first, then iteration".

Overall, the focus of the computing power competition has shifted from single-card performance to full-link efficiency and cost. With large-model inference costs continuing to fall, the market is amplifying the "cost-performance" advantage of domestic GPUs.

Embodied Intelligence: Performing and Working

If last year's robot exhibition area was still at the stage of "robots being able to walk and stand", this year's embodied intelligence has started to "perform and work". Whether it is on the stage or in industrial operations, robots move more smoothly, react more sensitively, and even show some "emotional expressions".

Qianxun Intelligence made its debut at WAIC with the humanoid robot Moz1, its star product for multi-task scenarios. Relying on the self-developed Spirit v1 VLA model and a top-notch motion control system, Moz1 can perform difficult moves such as the moonwalk, S-shaped cornering, and active balance challenges, with a smoothness approaching that of a human. The booth was divided into several interactive areas: in the beverage area, after visitors scanned a code to place an order, Moz1 could accurately pick up and deliver the drink; in the remote-control area, visitors could "control the robot to play a palm maze" through zero-latency synchronous following; in the clothes-folding area, Moz1 could recognize messy clothes and fold them neatly into "tofu-shaped" bundles, demonstrating strong generalization ability.

Fourier Intelligence chose another path: combining emotional expression with rehabilitation. It brought the upcoming interactive companion robot GR-3 to WAIC, introducing a soft-skin covering and a warm Morandi color scheme for the first time so that the robot looks less "cold". The GR-3 is mainly for healthcare and companionship scenarios, providing services such as medical guidance, cognitive rehabilitation training, upper-limb rehabilitation, and motor function reconstruction. The on-site "Embodied Intelligence Rehabilitation Port" exhibition area demonstrated a complete solution from medical consultation to remote rehabilitation. The cooperation between rehabilitation robots and exoskeleton devices let visitors see directly how embodied intelligence is changing the medical rehabilitation experience.

Zhiyuan Robotics' "dual-mode (bipedal and wheeled)" robot Lingxi X2-N showed off a new trick on stage. It can switch freely between bipedal walking and a wheeled "fire-wheel" mode, walking nimbly among people and also gliding at high speed in chase routines. At the conference opening ceremony, Lingxi X2-N performed on the same stage as a children's art troupe. Set off by the lighting, the act was both high-tech and full of childlike fun, and it became a "robot show" that went viral on social media platforms.

YouiTech demonstrated its industrial embodied intelligence model MAIC, a multi-form robot collaboration system for factory scenarios built around the idea of "one brain, multiple forms". Through multimodal perception and adaptive multi-arm collaboration, it links operations from handling and picking through to full logistics scheduling. In the booth demonstration, mobile manipulation robots and humanoid robots cooperated to complete complex assembly-line tasks, showing the leap from single-robot intelligence to group intelligence.

Unitree Technology was the biggest crowd-puller at this year's WAIC. It brought to the site the humanoid robot fighting competition that had previously been broadcast live on CCTV, and the repeated bouts drew a large crowd in front of the booth. Its star product, the G1, is 1.32 meters tall and weighs 35 kilograms. It has 29 flexible joints and can perform difficult moves such as straight punches, hooks, and side kicks. Supported by an intelligent balance algorithm, it can quickly get up after being knocked down. The audience surrounded the ring, exclaiming as the robots punched and dodged, and the atmosphere was comparable to that of an e-sports final.

In addition to the fighting humanoid robots, Unitree also brought the industrial quadruped robot Unitree B2 and the consumer-grade quadruped robot Unitree Go2. The B2 is mainly for industrial inspection, firefighting, and emergency rescue. It can carry a load of 120 kilograms, cross a 1-meter-wide trench, and work stably in temperatures from -20°C to 55°C. The Go2 targets the consumer market, with a price of less than 10,000 yuan. It supports voice interaction powered by a GPT large model, can perform moves such as handstands and backflips, and also comes in a "wheeled-legged" version, the Go2-W, which can run, jump, and slide. It