What did Google and Alibaba say on the same day?
At 1:00 a.m. Beijing Time on May 20th, as night fell in Mountain View, the Google I/O Developer Conference arrived as scheduled. On the stage, Google CEO Sundar Pichai repeatedly mentioned a word: Agent.
A few hours later, in Hangzhou on the western shore of the Pacific, Alibaba Cloud put forward a slogan at its summit that read almost like a declaration: "The users of the cloud are changing from humans to Agents."
With no prior coordination, the two messages were strikingly similar.
Looking further back, at the end of April, Google, Meta, Microsoft, and Amazon released their financial reports on the same day. Only Google emerged as the obvious "winner", thanks to explosive growth in its cloud business. A few days later, Alibaba's financial report for the fourth quarter of fiscal year 2026 showed that Alibaba Cloud's external commercial revenue increased by 40% year-on-year, and the revenue from AI-related products accounted for more than 30% for the first time. Alibaba CEO Wu Yongming stated plainly that the company's full-stack AI investment has moved past the incubation period and entered a cycle of large-scale positive commercial returns.
As two of the few technology companies worldwide that have successfully closed the full loop of "self-developed AI chips + self-built cloud + self-developed large models" in the open market, Google and Alibaba are becoming the "top students" of this AI investment cycle in the United States and China respectively. At the same time, both point to the same new customer as the next stop in the AI competition: no longer "humans", but Agents.
From "Competing in Models" to "Competing in Implementation"
In the past two years, the narrative in the AI industry has been simple and crude: Whose model has more parameters? Whose score on the leaderboard is higher? Whose reasoning ability is stronger? The entire industry has launched a crazy arms race around large models.
But a turning point has arrived.
The "2026 Artificial Intelligence Index Report" from Stanford University made a key judgment: the gap between top-tier large models in China and the United States has been "substantially eliminated", and the leading models are running neck and neck. Meanwhile, the models themselves are undergoing rapid "inflation": capability differences remain and models still matter, but they no longer constitute a decisive commercial barrier.
What really makes a difference is another thing: Who can make AI really "work"?
The most obvious change at this year's Google I/O Conference is that Google has shifted its focus from "model capabilities" to "Agent capabilities". One of the core launches is Spark, a personal Agent.
Spark runs on Google Cloud and can directly call services in the Google ecosystem such as Gmail, Docs, and Sheets. It can coordinate autonomously across applications and complete complex tasks such as retrieving information, drafting emails, and compiling meeting minutes. Even when the user turns off the device, it keeps working, like a real "digital employee" that is always online.
The key behind Spark is Google's new Antigravity 2.0 framework.
Google ran a live demonstration: Antigravity, paired with Gemini 3.5 Flash, was tasked with building an operating system from scratch. 93 sub-Agents worked in parallel, issued more than 15,000 model requests, and consumed 2.6 billion tokens; after 12 hours, a complete OS kernel had been generated automatically. The scheduler, memory management, and file system were all written by Agents, as were code testing and auditing, at a cost of less than $1,000.
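To put the demo figures in proportion, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above and treats "more than 15,000" requests and "less than $1,000" as bounds, so the derived values are rough averages rather than anything Google reported; the variable names are illustrative.

```python
# Figures quoted from the demo; "more than" / "less than" values are bounds.
SUB_AGENTS = 93
MODEL_REQUESTS = 15_000      # "more than 15,000": treated as a lower bound
TOKENS = 2_600_000_000       # 2.6 billion tokens
HOURS = 12
COST_USD = 1_000             # "less than $1,000": treated as an upper bound

# Simple averages derived from the quoted figures.
requests_per_agent = MODEL_REQUESTS / SUB_AGENTS
tokens_per_request = TOKENS / MODEL_REQUESTS
requests_per_hour = MODEL_REQUESTS / HOURS
usd_per_million_tokens = COST_USD / (TOKENS / 1_000_000)

print(f"requests per sub-Agent:      ~{requests_per_agent:.0f}")
print(f"tokens per request:          ~{tokens_per_request:,.0f}")
print(f"requests per hour:           ~{requests_per_hour:,.0f}")
print(f"blended cost ceiling:        ${usd_per_million_tokens:.2f} per million tokens")
```

On these assumptions, each sub-Agent averaged roughly 161 requests, each request carried on the order of 173,000 tokens, and the blended cost stayed below about $0.38 per million tokens. That last ratio is the kind of inference cost the rest of the article treats as the decisive variable.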
Meanwhile, at the Alibaba Cloud Summit in Hangzhou, Alibaba Cloud offered another perspective. Liu Weiguang, a senior vice president of Alibaba Cloud, said that once Agents pass the tipping point, they can work 24/7, creating unlimited demand for AI and cloud services.
That day, Alibaba Cloud launched a full stack of products built around Agents, spanning underlying chips, Agentic Cloud, models, and inference platforms. It was also the first time in Alibaba Cloud's 17-year history that it launched a separate product portal outside its official website.
This portal is unusual. When you open the page, there is no prominent product list or console, and no traditional navigation structure. There is only one line on the homepage: "Install Skills npx skills add QianWen-AI/qianwen-ai (One-click installation of the qianwen-ai skill package)".
This is not for humans, but for Agents.
The AI Industry Enters the "Full-Stack War"
If the AI competition of the past two years was essentially a "model war", the industry is now entering a more complex stage of competition.
Once AI really starts to take on complex tasks, the dimensions of competition expand dramatically. The contest is no longer just about model capabilities, but about a series of questions: Do you have cloud infrastructure? Can you control inference costs? Do you have chip capabilities? Can you connect to real-world business systems? Do you have an ecosystem entry point? Do you have a continuous closed loop of data?
Ultimately, the competition is no longer about single-point technology, but a full-stack war spanning "chips - cloud - models - products (Agents)".
This is why the companies currently at the forefront almost all have highly similar capability structures. Google has TPUs, its own cloud, the Gemini models, Search, and the Android ecosystem; Alibaba has Hanguang chips, Alibaba Cloud, the Qianwen models, and a large enterprise ecosystem.
The reason is that the Agent era inherently requires chips, cloud, models, tool invocation, business systems, and the ecosystem to work together. Agents depend on computing power, inference costs, and infrastructure far more than traditional chat models do.
At the chip level, Alibaba Cloud presented a complete lineup of self-developed data center chips covering compute, networking, and storage.
The Zhenwu M890, a new-generation all-in-one training and inference AI chip from T-Head Semiconductor, made its debut. Its specifications are impressive: 144 GB of device memory, an inter-chip interconnect bandwidth of 800 GB/s, and three times the performance of the previous-generation Zhenwu 810E.
At the same time, Alibaba released the Panjiu AL128 Super Node Server, based on the Zhenwu M890 and equipped with the self-developed interconnect chip ICN Switch 1.0. It can combine 128 AI chips into a single computer, with a P2P latency of less than 150 ns, and is aimed at the massive concurrent inference and large-model training that Agent scenarios demand.
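As a rough sense of scale, the quoted figures can be combined into an aggregate memory estimate for one super node. The sketch below assumes all 128 chips are Zhenwu M890 parts with the full 144 GB each, and the "weights-only" parameter ceiling further assumes 2 bytes per parameter (FP16/BF16) while ignoring KV cache, activations, and any overhead; neither assumption comes from Alibaba.

```python
# Quoted specifications.
CHIPS_PER_NODE = 128         # Panjiu AL128: 128 AI chips in one super node
MEMORY_PER_CHIP_GB = 144     # Zhenwu M890 device memory

# Assumption (not from the article): 2 bytes per parameter, weights only.
BYTES_PER_PARAM = 2

total_memory_gb = CHIPS_PER_NODE * MEMORY_PER_CHIP_GB
total_memory_tb = total_memory_gb / 1_000            # decimal terabytes
max_params_trillions = total_memory_gb * 1e9 / BYTES_PER_PARAM / 1e12

print(f"aggregate memory: {total_memory_gb:,} GB (~{total_memory_tb:.1f} TB)")
print(f"weights-only ceiling at 2 bytes/param: ~{max_params_trillions:.1f}T parameters")
```

Under those assumptions, one node pools about 18.4 TB of memory. Real deployments would fit far less than the weights-only ceiling, since concurrent Agent inference spends much of its memory on context rather than weights.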
T-Head Semiconductor also announced the roadmap for the Zhenwu series for the first time: over the next two years, it will launch two successive generations of more powerful chips, the Zhenwu V900 and the Zhenwu J900. To date, cumulative shipments of Zhenwu AI chips have reached 560,000 units, serving more than 400 customers in more than 20 industries.
Google proposed a "dual-chip" strategy for the eighth-generation TPU: TPU 8t is specialized for training, with a single Pod accommodating 9,600 chips and a total cluster computing power of 121 ExaFLOPS; TPU 8i is specialized for inference, with an 80% improvement in inference cost-performance.
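Two quick calculations help read these TPU numbers. The sketch below assumes the 121 ExaFLOPS figure refers to one 9,600-chip Pod (the article does not state the numeric precision), and it interprets "80% improvement in cost-performance" as 1.8x the performance per dollar; both readings are assumptions.

```python
# Quoted figures for TPU 8t.
CHIPS_PER_POD = 9_600
POD_EXAFLOPS = 121           # assumed to describe one full Pod

# 1 ExaFLOPS = 1,000 PetaFLOPS.
per_chip_petaflops = POD_EXAFLOPS * 1_000 / CHIPS_PER_POD

# Quoted figure for TPU 8i, read as 1.8x performance per dollar.
COST_PERF_GAIN = 0.80
work_per_dollar = 1 + COST_PERF_GAIN        # relative to the previous baseline
cost_for_same_work = 1 / work_per_dollar    # fraction of the old cost

print(f"TPU 8t per-chip compute: ~{per_chip_petaflops:.1f} PFLOPS")
print(f"TPU 8i cost for the same work: ~{cost_for_same_work:.0%} of before")
```

On this reading, each TPU 8t chip contributes roughly 12.6 PFLOPS, and an 80% cost-performance gain cuts the cost of a fixed inference workload to about 56% of its previous level, a reduction of around 44% rather than 80%.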
Cheaper, more capable models and chips alike answer the new demands of the Agent era. Only by mastering chips, cloud, models, and the ecosystem at the same time can a company truly control the core variable of the Agent era: inference cost.
The Entry Point to the Agent Era
Over the past year, one view has been popular in the large-model industry: AI will reshape the entry points to the Internet, and search engines, apps, and browsers will be replaced.
However, the choices made by Google and Alibaba are thought-provoking. They did not abandon their original ecosystems. Instead, they began to re-embed AI into them.
Google has integrated Agents into Search, Chrome, Android, Gmail, and YouTube. As early as the beginning of this year, Alibaba fully integrated Qianwen into its own products, including Taobao, DingTalk, and Alipay.
In a sense, this is a "return" to the logic of the industry giants.
In the past two years, Google has been frantically catching up with OpenAI, and Alibaba has been trying to find its own AI super entry point. But now, both companies seem to have realized again that the real moat has always been in their own hands.
Spark can natively read Gmail, Docs, and Calendar. That data already sits on Google's servers. Users don't need to authorize access repeatedly, configure complex interfaces, or build additional workflows.
This is exactly the threshold that many independent Agent products find hardest to cross. Similarly, Alibaba's renewed focus on Qianwen is a fresh bet on its own ecosystem capabilities. At the beginning of this year, the Qianwen App launched more than 400 AI service capabilities at once.
The goal is clear: to transform AI from a "chat tool" into a "service entry point". Alibaba has realized once again that its real advantage lies in its connections to the real world, such as Taobao, Alipay, DingTalk, local life services, and enterprise services.
Google has gone a step further this time by competing for the "protocol layer" of the Agent era. At this I/O Conference, Google highlighted several protocols and related technologies, including UCP, AP2, MCP, and SynthID.
These may sound highly technical, but the underlying business logic is clear. Google is trying to define the business rules of the Agent era.
UCP (Universal Commerce Protocol) aims to let Agents use a unified standard to search, compare prices, add items to a cart, and place orders across different e-commerce platforms. AP2, the Agent Payments Protocol, addresses how to confirm that money an Agent spends on your behalf was actually authorized by you. MCP, the Model Context Protocol proposed by Anthropic, addresses how AI calls external tools, and Google has chosen to fully integrate MCP this time. Finally, there is SynthID, an AI content watermarking technology launched by Google DeepMind. So far, it has embedded digital watermarks in more than 100 billion images and videos and 60,000 years of audio.
Putting the four together, we can see what Google really wants to do: UCP is responsible for "how AI buys things", AP2 for "how AI pays", MCP for "how AI calls tools", and SynthID for "how to make AI content trustworthy".
Together, they form a complete Agent business infrastructure.
In a sense, this is somewhat similar to Google's development of Android back then. First, it defines the standards, and then the entire industry follows.
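The division of labor among the four layers described above can be sketched as a simple lookup table. This is purely illustrative: the step names are hypothetical, nothing here calls a real UCP, AP2, MCP, or SynthID API, and only the step-to-layer mapping is taken from the article.

```python
from enum import Enum


class Layer(Enum):
    """The four layers named in the article."""
    UCP = "Universal Commerce Protocol"
    AP2 = "Agent Payments Protocol"
    MCP = "Model Context Protocol"
    SYNTHID = "SynthID watermarking"


# Hypothetical step names for Agent tasks; only the layer each
# step maps to comes from the article.
STEP_TO_LAYER = {
    "call_external_tool": Layer.MCP,
    "search_and_compare_prices": Layer.UCP,
    "add_to_cart_and_order": Layer.UCP,
    "authorize_and_pay": Layer.AP2,
    "mark_generated_content": Layer.SYNTHID,
}


def layers_used(steps):
    """Return the distinct layers a task touches, in first-use order."""
    seen = []
    for step in steps:
        layer = STEP_TO_LAYER[step]
        if layer not in seen:
            seen.append(layer)
    return seen


# One Agent-driven purchase, expressed as a sequence of hypothetical steps.
purchase = [
    "call_external_tool",
    "search_and_compare_prices",
    "add_to_cart_and_order",
    "authorize_and_pay",
]
for layer in layers_used(purchase):
    print(f"{layer.name}: {layer.value}")
```

A single purchase in this sketch touches MCP, UCP, and AP2 in turn, which is the sense in which the layers add up to shared infrastructure rather than separate features.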
AI Will Serve AI
Looking back over the years, Google and Alibaba have represented two completely different Internet models.
Google has long been more "technology-driven". Its core capabilities have always been search, operating systems, cloud, foundation models, and developer ecosystems. Google believes that as long as it masters the underlying technology and infrastructure, it can define the next-generation Internet. Alibaba, by contrast, is more "business-driven", focusing on transactions, payments, platform operations, local life services, and enterprise services, with the aim of making it easy for everyone to do business.
Now, AI, especially Agents, is bringing the two companies back to the same coordinate system.
Even the core keywords at their respective conferences are becoming highly similar: Google is talking about Agents, inference, protocols, tool invocation, and payments; Alibaba is also talking about Agents, inference, workflows, enterprise systems, and AI factories.
In essence, this is the result of the AI industry moving into deep water. As model capabilities gradually converge, all the giants will ultimately arrive at the same destination: controlling the infrastructure of the AI era.
In the past, technology giants served "humans". In the future, they may first serve Agents.
Those who actually use the cloud, call tools, process workflows, connect to business systems, complete transactions, and execute tasks may no longer be humans themselves, but a growing number of intelligent Agents that are always online and work 24/7.
This article is from the WeChat official account "IT Times" (ID: vittimes), written by Jia Tianrong, and is published by 36Kr with authorization.