Alibaba and Google: Agents March in Formation
Open the new website for Alibaba Cloud's AI product and the first thing you see is a one-line installation instruction for skills. The instruction is written to be read by agents: once an agent runs it, the agent can call the Tongyi Qianwen large model natively, with a single click.
This is the first time in 17 years that Alibaba Cloud has created an independent product website outside its main site. Unlike a typical Internet product, however, the site is not built for human users. The first screen is designed specifically for agents. The logic is simple: if your users are AIs, you don't need a banner; you need an executable instruction.
The underlying logic is not hard to understand. In the agent era, the customers being served are no longer only humans but also agents that decide and act autonomously. Alibaba Cloud has explicitly repackaged its cloud products as skill components, MCP-standard interfaces, and CLI instructions, so that each cloud product becomes a standardized capability module that an agent can call directly, like a program function.
Meanwhile, on the other side of the ocean, an agent transformation is also under way. At its I/O conference, Google announced updates across its entire technology and product portfolio, from chips to models to applications. The two AI heavyweights, one Chinese and one American, are once again betting on the same direction.
At the application level in particular, Google's latest Antigravity 2.0 platform serves as the central environment for developing and managing clusters of autonomous AI agents. It can independently write a complete operating system in 12 hours, and it centers on three things: conversations with agents, artifacts generated by agents, and the orchestration of multiple agents. "We're making Antigravity the only platform you need for agent-first development."
Agent-first: agents do everything. Similar trends are emerging at Alibaba Cloud and Google at the same time.
This summer, Google is introducing an intelligent shopping cart that makes it easier for users to shop while browsing websites or chatting with Gemini. The cart also searches for discounts automatically and monitors price drops. Through underlying infrastructure such as the Universal Commerce Protocol (UCP), Google has also brought in large companies like Amazon, Meta, and Microsoft. This cross-platform shopping experience is set to make the shopping cart of the agent era even smarter.
How should Alibaba respond to Google's full-scale push into shopping in the agent era, and how should it establish industry-standard rules for agent shopping?
Alibaba Cloud will be more open
An earlier article, "Google Cloud gives Alibaba Cloud a lesson", argued that it matters for a cloud provider to win the competition, but that it matters even more for the competition to take place inside its ecosystem. Model freedom is one of Google Cloud's key advantages, and it is an important lesson that Google has taught Alibaba Cloud.
Now, Alibaba Cloud is learning from Google's strengths and aiming for the "most open cloud in the AI era". As an enterprise platform for large-model application development, Alibaba Cloud Bailian has also started integrating third-party models.
In addition to Alibaba's self-developed Qianwen model matrix, third-party models such as Zhipu GLM-5.1, MiniMax M2.7, Moonshot AI (Yuezhianmian) Kimi K2.6, Kling (Keling), and Vidu Q3 are also integrated into the Bailian platform.
On the official website of Qianwen Cloud, there are already over 150 model series and over 480 different models available, covering mainstream models in China and abroad. Multiple models can be compared simultaneously. Developers can quickly test, evaluate, and select models according to their needs.
At the same time, the core capabilities of Qianwen Cloud's model service are packaged into skills and CLI tools. This means that agent tools like OpenClaw can learn the capabilities of the entire platform from a single instruction and then plan autonomously. For image understanding tasks, the visual model is called; for image generation tasks, the image generation model is called; for video tasks, the video model is called. The whole process requires no human intervention and no hand-written integration code.
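The routing described above can be pictured as a simple lookup from task type to model. The following is a minimal sketch only: the task-type labels, the model identifiers, and the `route_task` function are all hypothetical and do not come from Qianwen Cloud's skills or CLI, which the article does not describe in detail.

```python
# Hypothetical task types and model identifiers, chosen for illustration only.
ROUTES = {
    "image_understanding": "vision-model",
    "image_generation": "image-gen-model",
    "video": "video-model",
}


def route_task(task_type: str) -> str:
    """Return the (hypothetical) model an agent would call for a task type."""
    try:
        return ROUTES[task_type]
    except KeyError:
        # An agent should fail loudly rather than guess a model.
        raise ValueError(f"no model registered for task type {task_type!r}") from None


if __name__ == "__main__":
    for task in ("image_understanding", "image_generation", "video"):
        print(task, "->", route_task(task))
```

In a real agent, the table would presumably be discovered from the platform's skill description rather than hard-coded, which is what makes the "one instruction" setup possible.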
For cloud provider customers, the transparent use of token resources is a very real problem.
Qianwen Cloud's solution is an intelligent and transparent management system. Agents can retrieve model usage data in real time, analyze trends, detect abnormal usage patterns, and optimize costs. Logs, key activities, and similar data can also be retrieved via the CLI to detect abnormal activity and track tasks.
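To make "detecting abnormal usage patterns" concrete, here is a minimal sketch of one common approach, a z-score check over hourly token counts. It assumes the agent has already fetched the usage numbers; the function name, the sample data, and the threshold of 2.0 are illustrative assumptions, not anything Qianwen Cloud is reported to use.

```python
from statistics import mean, stdev


def find_usage_anomalies(hourly_tokens: list[int], threshold: float = 2.0) -> list[int]:
    """Return indices of hours whose usage deviates from the mean by more
    than `threshold` sample standard deviations."""
    if len(hourly_tokens) < 2:
        return []  # not enough data to estimate spread
    mu = mean(hourly_tokens)
    sigma = stdev(hourly_tokens)
    if sigma == 0:
        return []  # perfectly flat usage, nothing stands out
    return [
        i for i, tokens in enumerate(hourly_tokens)
        if abs(tokens - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    # Made-up sample: one hour with a sudden spike in token consumption.
    usage = [1000, 1100, 950, 1050, 1000, 9000, 980, 1020]
    print(find_usage_anomalies(usage))
```

With very short series a single outlier inflates the standard deviation, so a low threshold is used here; a production system would more likely rely on a longer history or a median-based measure.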
This is also a common trend for Alibaba Cloud and Google Cloud. They are not just selling models, but becoming AI factories that provide computing power and infrastructure.
Google's strength is its global developer density, while Alibaba's strength is the depth of its local ecosystem.
Google announced at the I/O 2026 conference that the number of tokens processed per minute via the API has already reached 19 billion. 8.5 million developers use Google's AI models for app development every month. Inside Google, over 3 trillion tokens are processed with AI development tools every day, and this number doubles every few weeks.
These are not only data on the computing power of models, but also core data on infrastructure capacity.
So, it's not surprising that the price of Gemini 3.5 Flash has tripled. According to Google's calculations, although the model is more expensive, it's more efficient and can save companies over $1 billion in AI costs annually.
Google is not competing on low prices. It wants every dollar to buy more processing. This is a complete departure from the traditional logic of price cuts. In the past, providers captured the market by cutting prices, following the logic of an entry fee. Now the price goes up, but so does efficiency. Whoever can produce higher-quality tokens at lower chip cost is following the logic of infrastructure.
The logic of infrastructure means that when an agent needs to call language capabilities, the agent first decides which capabilities and which call path to choose. This is the real goal of all the technological announcements at the two summits of Alibaba and Google.
Google is still Alibaba's teacher
Google presented a great deal at the I/O 2026 conference, from the Gemini Omni world model at the model level to, at the hardware level, the first audio smart glasses with Gemini built in, based on the Android XR platform. Agents are now fully integrated into all of Google's business areas and have formed their own ecosystem in scenarios such as search, office work, and shopping. This makes it difficult for any competitor to overtake Google.
Even more important is the underlying investment. Google has planned an annual capital expenditure between $180 and $190 billion this year, with the majority invested in customized chips.
Google has already introduced the TPU 8t, optimized for training, and the TPU 8i, optimized for inference. This shows that chips are at a crossroads, with designs becoming increasingly specialized. Training requires extreme computing power and massive parallelism, while inference requires extremely low latency and high memory bandwidth. There is a fundamental design tension between these two goals: trying to achieve both on a single chip means that neither is optimal.
Alibaba's newly introduced Zhenwu M890 has 144 GB of built-in HBM memory and an inter-chip connection bandwidth of 800 GB/s. Its overall performance is three times that of its predecessor, the Zhenwu 810E.
128 chips form the Panjiu AL128 super-node, where P2P latency is below 150 nanoseconds. Gao Hui, vice president of PingTouGe, defines the goal of this chip as follows: when an agent performs tasks, it can call multiple models within milliseconds. This requires close cooperation between the CPU, GPU, network, and memory, not just an increase in computing power.
The Zhenwu M890 is a design for both training and inference in one. This is in sharp contrast to Google's approach, which separates training and inference. The two options are based on different assessments of the current main bottlenecks.
In pursuing integrated chip designs, Alibaba's PingTouGe stands alongside NVIDIA and Baidu Kunlunxin. Google TPU and Huawei Ascend, on the other hand, belong to the "total differentiation school". Such diverging technical directions are an inevitable result of computing power reaching scale. Whether a company offers customers a simple, cost-effective all-in-one solution or a solution with a clear division of labor is a response to different market needs.
Alibaba and Google are taking different paths in chip development. Interestingly, Google's eighth-generation TPU and Alibaba's Zhenwu V900 chip are planned on similar schedules, both targeting the end of 2027.
This could be a common bet. The next main battlefield in the competition for AI performance is not who has the model with the largest number of parameters, but who can best respond to market needs and produce high-quality tokens with the lowest energy consumption.
From the perspective of chip development, Google is still Alibaba's teacher. Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of the Public Cloud Business Unit, believes that the combination of Google TPU and Gemini achieves the highest performance. The underlying logic is that a self-developed chip and a self-developed model can always achieve the best cost efficiency.
Who will change intelligent shopping?
What Alibaba should most urgently anticipate and plan for is Google's newly introduced Universal Shopping Cart function, which targets e-commerce consumption in the agent era.
This is a new AI scenario developed by Google, called the "Universal Cart". Users can add items at any time during searches, on YouTube, and in Gmail. The shopping cart automatically searches for discounts in the background, monitors price drops, and gives reminders for restocking. Then, payment can be made with Google Wallet, and it automatically calculates which payment card is the cheapest. Even if users don't pay with Google, they can also return to the merchant's website and complete the order there.
Google wants to create a one-stop shopping destination. It acts as an "intermediary" in users' shopping and currently charges no commission.
Even more important, Google's underlying Universal Commerce Protocol (UCP) and its AP2 protocol for payment security are establishing a new set of e-commerce rules. This is what the entire e-commerce industry should anticipate.
The UCP can be understood as an open standard protocol for AI-driven shopping. It covers the whole chain, from product search, adding to the cart, and purchasing to payment and customer service. The rules were initiated by Google together with large retail and commerce companies such as Walmart, Shopify, and Target. In April, companies such as Amazon, Microsoft, and Meta also joined this open standard.
This means that in the future, it won't be humans placing orders on e-commerce websites, but individual agents. They compare prices and place orders on behalf of humans and can operate on many shopping websites, not just a specific one.
This is in sharp contrast to e-commerce agents in the Chinese market. Doubao can place orders on the e-commerce platform of Douyin (the Chinese version of TikTok), and the Qianwen app can be connected to Taobao to place orders. But neither can make cross-platform purchases, so each agent's capabilities are confined to its own territory.
Google wants to expand this intelligent shopping experience to a larger market. The agent shopping experience of the "Universal Cart" will be launched in Google Search and Gemini this summer. The UCP shopping experience will be available in Canada, Australia, and the UK in the next few months and will gradually expand to industries such as hotel bookings and local food deliveries.
In addition, Google's AP2 protocol is the underlying rule set that secures intelligent shopping. It is designed to ensure that agents can pay safely on behalf of users within the limits that users set.
The AP2 protocol establishes a transparent and verifiable link between users, merchants, and payment processors. Encryption is used throughout the process to protect user data. The protocol also includes immutable digital records to ensure that agents always act in users' interests and to give buyers and sellers a permanent audit trail in case of returns or disputes.
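One generic way to obtain records that cannot be altered unnoticed is a hash chain, in which every entry includes the hash of the entry before it. The sketch below illustrates that general technique only. It is not AP2's actual record format, which is not described here, and the field names and helper functions are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(event: dict, prev_hash: str) -> str:
    """Hash an event together with the hash of the preceding record."""
    payload = json.dumps({"event": event, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain: list[dict], event: dict) -> None:
    """Append an event, linking it to the previous record by hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"event": event, "prev_hash": prev_hash,
                  "hash": _digest(event, prev_hash)})


def verify_chain(chain: list[dict]) -> bool:
    """Return True only if no record was altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(record["event"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    log: list[dict] = []
    append_record(log, {"action": "authorize", "limit_cents": 5000})
    append_record(log, {"action": "purchase", "amount_cents": 4200})
    print(verify_chain(log))          # chain is intact
    log[1]["event"]["amount_cents"] = 1
    print(verify_chain(log))          # tampering is detected
```

A chain like this only makes tampering evident; making the record trustworthy to both buyer and seller would additionally require signatures from the parties involved.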
This means that agent shopping must meet certain conditions, including specifying the desired brands and products and the spending limit. When these conditions are met, the purchase is automatically completed.
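The conditions just listed (desired brands, desired products, a spending limit) amount to a simple authorization check. The following minimal sketch shows what such a check might look like. The `PurchaseMandate` and `Offer` types and the `may_auto_purchase` function are hypothetical names for illustration and are not taken from the AP2 specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PurchaseMandate:
    """What the user has pre-approved (hypothetical structure)."""
    allowed_brands: frozenset[str]
    allowed_products: frozenset[str]
    spending_limit_cents: int  # integer cents avoid float rounding issues


@dataclass(frozen=True)
class Offer:
    """A concrete offer the agent found at a merchant."""
    brand: str
    product: str
    price_cents: int


def may_auto_purchase(mandate: PurchaseMandate, offer: Offer) -> bool:
    """The agent may buy only if every user-set condition is satisfied."""
    return (
        offer.brand in mandate.allowed_brands
        and offer.product in mandate.allowed_products
        and offer.price_cents <= mandate.spending_limit_cents
    )


if __name__ == "__main__":
    mandate = PurchaseMandate(
        allowed_brands=frozenset({"BrandA"}),
        allowed_products=frozenset({"running shoes"}),
        spending_limit_cents=10_000,
    )
    print(may_auto_purchase(mandate, Offer("BrandA", "running shoes", 8_999)))
    print(may_auto_purchase(mandate, Offer("BrandA", "running shoes", 12_000)))
```

Any offer that fails a condition is simply not purchased automatically; in practice the agent would presumably hand the decision back to the user.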
A2A covers communication between agents, UCP covers agents' commercial activities, and AP2 covers agents' payment authorization. With these three layers, Google is writing not just a product but the fundamental rules for cross-platform shopping in the agent era. This is also an industry-wide trend that Chinese e-commerce giants need to anticipate early:
The battlefield is no longer on which platform users shop, but with which agent users place orders.
For Chinese users, consumption habits and trust in e-commerce platforms will not change in the short term. In the long run, however, the logic of the "shopping entry" will change once the intelligent shopping protocol initiated by Google matures globally.
E-commerce platforms such as Alibaba, JD.com, and Pinduoduo must decide whether to build their own rule system or adapt to this global standard protocol.