The fifth battle of Chinese cloud providers has begun.
Nineteen years have passed in a flash since the starting gun sounded in 2007, and China's cloud computing industry has been on a fast development track the whole time.
Time has flown by, and the cloud has spread across the sky.
Now, the horn of the fifth change has sounded again.
Over those 19 years, cloud computing has moved through hardware virtualization, platformization, and cloud-native architecture to AI integration. It is no exaggeration to say that it has been the foundation on which China's Internet built a unified computing power base and settled its underlying digital architecture.
Cloud computing has also reshaped the form that China's digital resources take and the logic by which their value circulates, pushing China's Internet from the era of consumer traffic into the era of industrial digitization.
From cost savings to efficiency gains, from application innovation to business empowerment: in the era of booming large models, the value of cloud computing has made another leap, this time toward value creation.
From IaaS to PaaS, from XaaS to MaaS/AIaaS: in 2026, Chinese cloud providers have once again reached an industry consensus that Agent Infra is the core strategic position.
Since Q2, the horns of the "fifth battle" among cloud providers have been sounding one after another.
On May 13th, Robin Li, the founder of Baidu Group, proposed that DAA (Daily Active Agents), rather than DAU (Daily Active Users), has become the metric of the AI era. As the second-generation entry point, agents have a much higher value ceiling than chatbots. He also set out three levels of "self-evolution" in the AI era, with the self-evolution of agents as a key aspect.
One week later, at the just-concluded 2026 Alibaba Cloud Summit, Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of its Public Cloud Business Unit, said bluntly that "cloud infrastructure is an important technological cornerstone in the Agentic era." Only a brand-new cloud infrastructure can meet requirements such as the stable, secure, and timely scheduling of agent operations. The newly launched "Qianwen Cloud" has even been called "a brand-new service mode born for agents."
Tencent Cloud and Volcengine are not holding back either: Tencent Cloud's AI Industry Application Conference and Volcengine's Force Prime Conference are both coming up in June. Agents have become another growth point in AI, and the transformation of the infrastructure services closely tied to them has become the new "story."
From AI Infra to "New Agent Infrastructure"
In the past three years, the core of cloud computing has been AI Native Cloud, and its essence is "computing power optimization." It provides computing power clusters with high concurrency, large throughput, and low latency around large model training and inference. The core indicators are cluster scale, computing power density, and network bandwidth.
However, as agents penetrate more scenarios, traditional AI infrastructure can no longer meet their native requirements. The industry consensus is that the cloud must shift from "serving models" to "serving agents." "Computing power utilization" is no longer the only core metric; "agent task success rate, execution efficiency, and governance controllability" have become the new yardsticks.
Yiou learned at the Alibaba Cloud Summit that Alibaba Cloud now splits its cloud into two layers. One is the AI native cloud, which continues to deepen computing power support for model training and inference. The other is the agent native cloud, which is built specifically as infrastructure for the orchestration, operation, and governance of agents.
Correspondingly, Baidu previously took Agent Infra as the core and proposed the full-stack architecture of "Chip-Cloud-Model-Agent," upgrading MaaS to Token Factory. The core is to enable each token to be efficiently converted into executable intelligent actions.
The essence of this transformation is a fundamental change in the service object of cloud computing: from serving "software written for humans" to serving "agents with autonomous decision-making and automatic execution."
Traditional clouds handle deterministic tasks that occupy resources for long periods under stable loads. Agent tasks have four characteristics: short life cycles, irregular bursts, dynamic dependencies, and task-level security. An agent may start a task and destroy it within a second, or it may run around the clock. It also needs to call databases, browsers, and third-party tools frequently, which requires cloud infrastructure to shift from "resource scheduling" to "task scheduling."
Robin Li, the founder of Baidu, said bluntly at the 2026 Create Conference: "Tokens may not represent the end. DAA (Daily Active Agents) is the new metric in the AI era." In the past, the industry competed in terms of who burned more tokens and who had a larger cluster. Now, the competition has shifted to who can support more agents to work stably and deliver results.
Alibaba Cloud also put forward a similar judgment: "For the first time, we have entered the stage of large-scale management of intelligence from large-scale management of computing power." The consensus of the two giants heralds a complete reconstruction of the underlying logic of cloud computing.
The "Incompatibility" of Traditional Infrastructure
The transformation of AI Infra is not accidental. There is a fundamental mismatch between the traditional cloud computing architecture and the native requirements of agents, so the transformation is a precondition for agents to enter real scenarios.
On the one hand, the resource scheduling of the "computing power era" has "failed." Agent workloads are completely different from traditional AI tasks and Internet tasks. They are short-lived: most agent tasks take seconds to minutes and are destroyed after use. They burst irregularly: traffic may surge ten-thousand-fold in an instant or lie dormant for a long time. They are strongly stateful and must continuously remember context and tool call history. And they are multimodal, frequently handling text, images, and video while calling tools such as databases.
The traditional cloud is designed for "long-term deployment and stable load" and cannot adapt to this "pulsed, stateful" load. For example, traditional containers take minutes to start and cannot support agents that start and stop within seconds. Resources are billed per instance over long periods, which does not match the cost profile of agents with "short-term high load and long-term dormancy." Alibaba Cloud's research found that when enterprises build their own agent platforms, the container cost alone far exceeds expectations.
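The billing mismatch described above can be made concrete with a little arithmetic. The following is a minimal sketch, not any provider's pricing model: the rates, the function names, and the workload (200 bursts of 30 seconds in one day) are all hypothetical, and the per-second rate is deliberately set at three times the reserved instance's per-second price.

```python
# Hypothetical prices and workload; none of these numbers come from a real price list.
HOURLY_INSTANCE_RATE = 0.36    # assumed cost of one reserved instance per hour
PER_SECOND_TASK_RATE = 0.0003  # assumed rate when billed only while a task runs


def always_on_cost(hours: float, hourly_rate: float) -> float:
    """Cost of keeping one instance reserved for the whole period."""
    return hours * hourly_rate


def per_task_cost(task_seconds: list[float], rate_per_second: float) -> float:
    """Cost when only the seconds actually spent running tasks are billed."""
    return sum(task_seconds) * rate_per_second


def utilization(task_seconds: list[float], hours: float) -> float:
    """Fraction of the period during which the agent was actually working."""
    return sum(task_seconds) / (hours * 3600)


if __name__ == "__main__":
    bursts = [30.0] * 200  # a "pulsed" day: 200 short tasks, dormant otherwise
    print(f"utilization:   {utilization(bursts, 24):.1%}")
    print(f"always-on:     {always_on_cost(24, HOURLY_INSTANCE_RATE):.2f}")
    print(f"per-task bill: {per_task_cost(bursts, PER_SECOND_TASK_RATE):.2f}")
```

Under these assumed numbers the agent is busy for only about 7% of the day, so even a per-second price three times higher comes out well below the cost of holding an instance around the clock. With a steadier load the comparison would flip, which is why instance billing suited traditional workloads.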
On the other hand, there is a contradiction between cost and efficiency. The inference cost of large models is high, and the multi-round calls and repeated context calculations of agents further exacerbate the cost pressure. Baidu's data shows that in traditional MaaS services, about 30% of tokens are used for repeated calculations, resulting in low inference efficiency. Alibaba Cloud also mentioned that when the KVCache hit rate is less than 70%, the memory bottleneck of inference will lead to a sharp drop in efficiency.
At the same time, enterprises face the situation that "95% of agent tasks are repetitive labor, and 5% are core decisions." Traditional infrastructure cannot reuse historical calculation results, resulting in "starting from scratch for each call" and high costs. Shen Dou, the executive vice president of Baidu, pointed out: "In the agent era, the cost is not the computing power cost but the token efficiency cost."
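To see why repeated context calculation weighs so heavily on multi-round agent calls, consider a rough sketch. It assumes a simplified model in which every call resends the whole conversation and a cache serves a fixed fraction of the earlier context. The function name, the turn lengths, and this cost model are illustrative assumptions and do not reproduce the Baidu or Alibaba Cloud figures.

```python
def tokens_computed(turn_lengths: list[int], cache_hit_rate: float) -> float:
    """Total tokens the model must process across a multi-turn agent session.

    Each turn adds `new_tokens` to the context. Earlier context is either
    served from cache (fraction `cache_hit_rate`) or recomputed from scratch.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    total = 0.0
    context = 0  # tokens accumulated in earlier turns
    for new_tokens in turn_lengths:
        recomputed = context * (1.0 - cache_hit_rate)
        total += new_tokens + recomputed
        context += new_tokens
    return total


if __name__ == "__main__":
    # Hypothetical session: a long system prompt, then three short tool-call rounds.
    turns = [1000, 200, 200, 200]
    for rate in (0.0, 0.7, 1.0):
        print(f"hit rate {rate:.0%}: {tokens_computed(turns, rate):.0f} tokens processed")
```

With no reuse, this toy session processes more than three times the tokens it would need with perfect reuse, which is the "starting from scratch for each call" effect in miniature. Real inference engines cache at a finer granularity, so the sketch shows only the direction of the effect and not its size.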
Security governance is another factor that cannot be bypassed. Agents can autonomously access core enterprise data, call business systems, and perform operations. The traditional cloud security system (account permissions, network isolation) is built on the logic of "humans using software" and cannot meet the requirements of agent identity authentication, fine-grained permissions, behavior auditing, and data leakage prevention.
For example, if an agent accidentally deletes a database or leaks customer data, the traditional security system can neither trace responsibility nor intercept the action in real time. And when multiple agents collaborate, traditional governance tools offer nothing at all for problems such as memory sharing, permission isolation, and task conflicts. Alibaba Cloud has summarized six core challenges of enterprise agent implementation, of which security and governance account for three. This is also the main reason why most enterprises "dare to do demos but dare not mass-produce."
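The combination of agent identity, fine-grained permissions, real-time interception, and behavior auditing can be illustrated with a small sketch. This is not how Alibaba Cloud's or Baidu's products work. `AgentIdentity`, `AuditLog`, `PolicyGate`, and the `db:read` / `db:delete` action names are all hypothetical, chosen only to show the idea of checking and logging every agent action before it runs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class AgentIdentity:
    """A hypothetical per-agent identity with an explicit action allow-list."""
    agent_id: str
    allowed_actions: frozenset


@dataclass
class AuditLog:
    """Append-only record of every attempted action, allowed or not."""
    entries: list = field(default_factory=list)

    def record(self, agent_id: str, action: str, resource: str, allowed: bool) -> None:
        self.entries.append(
            {"agent": agent_id, "action": action, "resource": resource, "allowed": allowed}
        )


class PolicyGate:
    """Checks permission and writes the audit entry before running an operation."""

    def __init__(self, log: AuditLog) -> None:
        self.log = log

    def execute(self, agent: AgentIdentity, action: str, resource: str,
                operation: Callable[[], Any]) -> Any:
        allowed = action in agent.allowed_actions
        self.log.record(agent.agent_id, action, resource, allowed)
        if not allowed:
            # Intercept before the operation runs, not after the damage is done.
            raise PermissionError(f"{agent.agent_id} may not {action} on {resource}")
        return operation()


if __name__ == "__main__":
    log = AuditLog()
    gate = PolicyGate(log)
    bot = AgentIdentity("report-bot", frozenset({"db:read"}))
    print(gate.execute(bot, "db:read", "orders", lambda: "42 rows"))
    try:
        gate.execute(bot, "db:delete", "orders", lambda: "table dropped")
    except PermissionError as err:
        print("blocked:", err)
    print(log.entries)
```

Because the audit entry is written before the permission decision takes effect, blocked attempts are traceable too, which is the gap the article attributes to account-and-network-based security.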
Agents will reshape the industry, and new infrastructure is the entry ticket. Robin Li previously predicted at the Baidu AI Developers Conference that global DAA will exceed 10 billion in the future, with multiple agents taking on work in every position and scenario. Alibaba Cloud predicts that agents will see explosive growth in the next two to three years, and that enterprise workflows will fully shift from "human-centered" to "agent-centered."
Data from PwC shows that 79% of US enterprises are already using agents in their businesses, and 88% plan to increase their investment. Gartner predicts that by 2028, 33% of enterprise software will natively integrate agent capabilities. Facing this definite trend, cloud providers must take the lead in deploying Agent-Native infrastructure; otherwise, they will lose the entry ticket for the next round of industrial competition.
Full-Stack Reconstruction: A Systematic Revolution from Chips to Products
The transformation paths of Alibaba Cloud and Baidu are highly similar. The AI full-stack capabilities based on "Chip-Cloud-Model-Agent" have become an important direction.
The most critical piece of hardware is the chip. Agents require chips with "high inference performance, low latency, high concurrency, and low cost." Traditional GPUs cannot meet these requirements, and both giants are focusing on self-developed chips.
Alibaba Cloud has launched the Zhenwu M800 training and inference integrated AI chip, with a supporting ICSwitch interconnection chip, and it is installed in the Panjiu AL128 super-node server. It is understood that this chip is specially optimized for agent inference, supports a high-speed network of 800Gbps, and a single cluster can support a scale of 100,000 cards. The linear expansion efficiency of 10,000 cards exceeds 96%, which mainly solves the computing power requirements of agents for high-concurrency inference and short-term high load.
Correspondingly, Baidu's self-developed Kunlun Chip has been iterated to the P800, continuously delivering clusters of 10,000 cards. The 256-card Tianchi super-node will be launched in June, and the inference efficiency will be increased by 50%. The Kunlun Chip is deeply adapted to the Wenxin large model and also supports mainstream models such as DeepSeek and GLM. The core is to improve token production efficiency and reduce agent call costs.
In addition, the architecture layer, product layer, and overall ecological capabilities will all be adjusted and changed as agents enter more scenarios and workflows.
It is worth noting that the transformation of cloud providers' agent infrastructure is not only a technological change but also an industrial change.
One manifestation is "the reduction of costs," although lower costs are not an immediate result. Xu Qing, president of the Terminal Intelligent Computing Business Unit of Alibaba Cloud Intelligence Group, said in a media interview when introducing JVSClaw: "The launch of JVSClaw has brought some needs from B-side customers. The technical team can abstract some capabilities required by Agent Infra from customer needs. Therefore, it is worth doing this even if it is 'losing money' at present."
In addition to cost, controllable security governance has become a major advantage. The capabilities of agent identity authentication, refined permissions, behavior auditing, and data isolation solve the core concerns of enterprises. Alibaba Cloud's Agent Security Center can trace every agent operation, and Baidu's AI Security Guardrail can intercept abnormal behaviors in real time.
However, as agents become the next breakout area seized on by the market and the industry, competition is intensifying and the industry reshuffle is visibly accelerating. Cloud providers are shifting from selling computing power to selling "intelligence services": agent orchestration, memory management, task execution, and the like will become new sources of revenue growth. Giants with full-stack capabilities (Alibaba, Baidu) have taken the lead. Small and medium-sized cloud providers that cannot keep up quickly will be marginalized, while solution providers in vertical fields will rise by focusing on agent scenarios in industries such as finance, manufacturing, and healthcare.
The transformation of cloud computing towards agents is essentially a triple resonance of technology, industry, and capital. Robin Li said: "In the future, the competition is not about computing power but about intelligence." And Alibaba Cloud believes that "tokens should turn into intelligence, and intelligence should turn into actions."
This revolution has just begun. Alibaba Cloud's Agentic Cloud and Baidu's Agent Infra are just the starting points, not the end points. In the future, the core competitiveness of cloud providers will no longer be cluster scale and computing power density but the intelligence operation ability to support the efficient, secure, and stable operation of massive agents.
For enterprises, embracing the agent era means embracing new productivity. For cloud providers, winning the agent infrastructure war is the key to seizing the industrial high ground in the next decade. From computing power to intelligence, the second half of cloud computing has just begun.
This article is from the WeChat official account "Yiou.com" (ID: i-yiou), written by Liu Juan and edited by Liu Huan. It is published by 36Kr with permission.