Google and Alibaba are collectively shifting: The era of serving people is over.

Phoenix Tech, 2026-05-21 16:32
When AI becomes a productive force, the industry giants are collectively changing their course.

Abstract:

A single command line has replaced the traditional official website. On May 20, Alibaba Cloud launched "Qianwen Cloud", whose homepage shows only an instruction that lets an Agent install skills on its own. At the same time, the Google I/O conference sent the same signal: the main users of cloud computing are shifting from humans to intelligent agents.

If you open the "Qianwen Cloud" website, the homepage shows only one line: Install Skills npx skills add QianWen-AI/qianwen-ai.

There is no product list, no console entry, and none of the complex navigation that the mobile Internet era has made familiar over the past decade. This is the entire homepage of "Qianwen Cloud", the first brand-new product website Alibaba Cloud has launched outside its main site in 17 years: a prompt instruction readable by an Agent, asking the agent to install Qianwen Cloud skills on its own. The users of the cloud are shifting from human engineers to intelligent agents, and Alibaba Cloud has decided to reshape its entire technical system accordingly.

"Alibaba Cloud is undergoing full-stack technological innovation, upgrading everything from the underlying chips and Agentic Cloud to models and the inference platform. Alibaba Cloud aims to build the largest AI factory in China," Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of its Public Cloud Business Unit, said at the 2026 Alibaba Cloud Summit on May 20.

At the same time, across the ocean, Google presented a similar theme at its annual developer conference. "Google has also been holding its conference these two days, and the theme seems to be the same as ours. We've thought alike," Liu Weiguang told media including Phoenix Tech in a small group interview after the conference.

This is not a coincidence. According to Google's official disclosures, two years ago Google's products processed 9.7 trillion tokens per month. By last year's I/O conference the figure had risen to about 480 trillion, and this year it has jumped to more than 3.2 quadrillion per month, a roughly seven-fold increase. Gartner, the IT research and consulting firm, offers a more concrete forecast: by the end of 2026, 40% of enterprise applications will integrate AI Agents, up from less than 5% a year earlier.
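The growth multiples implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the three monthly token counts quoted above; the labels and function name are illustrative.

```python
# Monthly token volumes quoted above, in trillions of tokens.
monthly_tokens_trillions = {
    "two years ago": 9.7,
    "last year": 480.0,
    "this year": 3200.0,  # 3.2 quadrillion = 3,200 trillion
}


def growth_multiples(series):
    """Return the multiple between each consecutive pair of values."""
    labels = list(series)
    return {
        f"{labels[i]} -> {labels[i + 1]}": series[labels[i + 1]] / series[labels[i]]
        for i in range(len(labels) - 1)
    }


multiples = growth_multiples(monthly_tokens_trillions)
for period, multiple in multiples.items():
    print(f"{period}: about {multiple:.1f}x")
```

The latest step works out to roughly 6.7 times, which matches the seven-fold figure once rounded; the step before it is close to 50 times.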

Behind the "shared thinking" and the steep growth curve lie a smooth handover of cloud computing's growth engine and a path of in-house development that has been forced to accelerate and is now pursued with determination.

Reconstruction of the Entrance: When Humans Are No Longer the Main Consumers of the Cloud

"In the future, the main users of cloud computing products will gradually shift from human engineers to Agents." At the beginning of the year, this was a key judgment reached within Alibaba Cloud.

Since its launch in 2009, the interface logic of the Alibaba Cloud official website has remained unchanged: humans log in, browse the menu, find cloud servers, databases, and storage in a sprawling product line, and configure parameters by hand. For Agents, however, this path is meaningless. Agents don't look at web pages or click buttons. What they need is a structured description of capabilities, a clear calling protocol, and a predictable feedback mechanism.
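To make those three requirements concrete, here is a minimal sketch of what an agent-facing interface might look like. Everything in it is hypothetical: the `Capability` and `CallResult` types, the `invoke` function, and the `create_database` example are illustrations only, not Qianwen Cloud's or Alibaba Cloud's actual interface, and no real cloud API is called.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    """Hypothetical machine-readable description of one cloud capability."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)  # parameter name -> type name


@dataclass
class CallResult:
    """Predictable feedback: the same fields come back on every call."""
    ok: bool
    output: dict
    error: str = ""


def invoke(capability: Capability, arguments: dict) -> CallResult:
    """Check arguments against the declared parameters, then simulate the call.

    No real service is contacted; a successful call just echoes its arguments.
    """
    missing = sorted(set(capability.parameters) - set(arguments))
    if missing:
        return CallResult(ok=False, output={}, error=f"missing parameters: {missing}")
    return CallResult(ok=True, output={"capability": capability.name, "arguments": arguments})


create_database = Capability(
    name="create_database",  # hypothetical capability name
    description="Provision a managed database instance.",
    parameters={"engine": "string", "storage_gb": "integer"},
)

success = invoke(create_database, {"engine": "postgres", "storage_gb": 50})
failure = invoke(create_database, {"engine": "postgres"})
print(success.ok, failure.error)
```

The point of the sketch is the shape rather than the details: an agent reads a declared capability instead of a web page, calls it through one uniform entry point, and always receives a result it can parse, whether the call succeeded or not.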

Liu Weiguang shared a detail. After "Lobster" became popular during the Spring Festival, some of Alibaba Cloud's external customers launched similar products. When an Agent like Lobster was created, "there was no need for humans to activate it. Lobster automatically activated the cloud computing resources in the background." Resource activation used to take human engineers two weeks; now it can be done within a day. "Agents are silently and automatically using the cloud."

Based on this observation, Alibaba Cloud made the decision to launch Qianwen Cloud.

Qianwen Cloud is positioned as "a pure official website for AI and for Agents". Its design is simpler, it mainly sells models and related applications, and it is fully Skill-enabled. "It will provide a much better experience for Agents to call directly than the Alibaba Cloud official website does."

In the past, people would first look for databases when entering the website. In the future, Agents will first look for models when entering. The birth of Qianwen Cloud reflects Alibaba Cloud's determination to fully shift towards Agents: when all applications are rewritten by AI and everything is for AI, the priority of the entrance must be reversed.

Token Economy: A 15-fold Leap from "Enhancement" to "Core Engine"

The change in the entrance is just an appearance. The core driving force behind this reconstruction is the explosive growth of the Token economy.

In particular, the leap in Coding ability has opened up new service opportunities for cloud providers.

"Last year, I said that Token expenditure accounted for less than 1% of enterprises' IT budgets. At that time, AI was just for 'efficiency improvement' and didn't change the essence of the business," Liu Weiguang recalled. "But the emergence of Coding ability is a huge watershed: AI has started to take on work that humans can't do."

He gave an example: a large number of legacy applications written in languages such as COBOL, C, and Java, some dating back to the 1970s and 1980s, have lost their comments, and the programmers who wrote them have retired. AI, however, can deconstruct these "code fossils" and move them to the cloud.

"The emergence of AI coding not only generates new applications but also deconstructs old applications with old code, which will bring about a wave of new applications," Liu Weiguang judged.

A bigger change comes from the leap in the capabilities of inference models and video models. One customer spent three months optimizing an open-source model with its own proprietary data. But once a new large model emerged, "it almost completely overshadowed the previous work. Today, the value of large models is much greater than that of using an open-source model with data optimization."

In the video field, he believes that there will be significant changes in the Chinese advertising industry in the future. "Everyone can make advertisements and produce films."

The value leap brought about by AI capabilities is directly reflected in the willingness to pay.

Currently, for AI-native startups, Token expenditure can account for 100% of IT costs; for Internet companies, it reaches 15% to 20%; for traditional enterprises, it is still below 5%.

Liu Weiguang also corrected what he sees as a misperception in the market. There is currently a practice of "forcibly combining video tokens and inference tokens for statistics. However, from a technical principle perspective, these are two different statistical methods." In his view, "today, we should look at the market space by modality and by model."

Alibaba Cloud has therefore set broader goals for its sales team, including the growth and coverage of paying Token customers; whether customers use Tokens to solve urgent needs and connect them to core systems; and how efficiently Agents built on Alibaba models complete closed-loop tasks autonomously.

When asked whether AI should be billed by "consumption" or by "results", Liu Weiguang told Phoenix Tech: "The ultimate goal must be result-based payment. Currently, most charges are based on quantity, but we can already see signs that customers are willing to pay for results."

The Hidden Battle between Chips and Models: Why the "Google TPU Path" Must Be Taken

However, the foundation of the Token economy - chips - is facing unprecedented uncertainty.

At this Alibaba Cloud Summit, the roadmap of the Zhenwu series of chips was made public for the first time. The newly unveiled M890 chip has 144 GB of memory and an inter-chip interconnect bandwidth of 800 GB/s, with three times the performance of the previous-generation 810E.

Gao Hui, vice president of Pingtouge Semiconductor, said at the summit that when an Agent executes a task, it may initiate dozens of model calls within milliseconds, which requires close coordination among the CPU, GPU, network, and storage. The full-stack, self-developed chip lineup is designed to achieve system-level coordination of compute, networking, and storage. Pingtouge also announced its iteration path for the next two years: the V900 and J900 will be launched in succession.

When asked why they chose this moment to reveal their trump card, Liu Weiguang replied: "The biggest difference from startup chip companies is that we only launched our products into the market after long-term market verification." He said that before the official release, the Zhenwu chips had already been widely adopted within Alibaba Group and Ant Group and in sectors such as intelligent driving, finance, government services, and telecom operators.

This almost replicates Google's classic path of combining TPU and Gemini. Liu Weiguang did not shy away from endorsing the "Google path": "Combining our own chips with our own models will definitely achieve the best cost-performance ratio. The combination of Google's TPU and Gemini has achieved the highest performance."

He further gave his judgment: "If in the future, each of our chips can generate more and higher-quality Tokens than our competitors, then we will win."

Notably, just a month ago, Google also launched its own self-developed chips, the TPU 8t optimized for pre-training and the TPU 8i optimized for inference, to "compete" with NVIDIA. Google's CFO then disclosed on the latest earnings call that Google's annual capital expenditure is expected to be raised to about $180 billion to $190 billion.

Meanwhile, at the Google I/O 2025 Developer Conference, Google also showed a complete hand of full-stack cards: from Ironwood to Gemini 2.5, and from Vertex AI to the Agent built into the browser, Google is likewise taking the closed-loop, full-stack path that runs from chips through models and inference to the Agent entrance.

Sundar Pichai said directly in his keynote speech: "We are in a new stage of the AI platform transformation." What Google wants to do is to "lower the threshold and accelerate creation". He announced that intelligent agents will be fully integrated into Google's main business, Search, and the AI assistant Gemini, and a new AI - powered search mode will be launched through the combination of Gemini and Search.

The fact that two leading cloud providers in China and the United States are betting on self-developed chips and full-stack collaboration at the same time reveals a shift in the logic of industry competition: the underlying contest in the Token economy has moved from "who has more GPUs" to "who can produce higher-quality Tokens at lower chip cost".

The effectiveness of this "chip-model synergy" has been demonstrated in practice. The summit revealed that Qwen3.7-Max, given only a task description, worked autonomously for 35 hours on the M890 chip, which it had never used before, and independently wrote and optimized a production-grade AI computing kernel. The final performance was 10 times that of the official version.

When asked about the progress of the goal of "capturing 80% of the AI cloud incremental market" proposed at the end of last year, Liu Weiguang gave a more specific figure: "Currently, we have captured more than 20% of the inference market. We haven't lost any accounts with major customers." However, he also admitted that "the emerging market is growing too fast. The revenue in one quarter is even greater than that in the past few years. Looking at the past doesn't make much sense. The key is to win in the future."

According to Alibaba Cloud's figures, the Token ARR of LLMs on its Bailian platform has grown 15-fold over the past five months.

Part of Alibaba Cloud's confidence comes from the conversion of existing customers: Coding customers are precisely the customers Alibaba Cloud already had. Because the cloud was used only by developers, 100% of them are Token customers today. In addition, the boom in Agents is helping to identify where customers are heading. For example, MiniMax's Lobster business runs on Alibaba Cloud, and a number of Lobster-like Agents have been created on Alibaba Cloud, which has greatly increased cloud resource consumption.

On the supply side, Liu Weiguang used a conversion relationship to underline the advantage of cloud providers: "There is a conversion ratio between Tokens and GPUs. Selling a certain amount of Tokens is equivalent to selling a certain amount of GPUs." He gave an example: "If we convert Tokens into GPUs, assuming that 1,000 Tokens are equivalent to one GPU card, we will find that the growth brought by one GPU card basically represents a 1:1 growth in CPU." In other words, using Tokens consumes GPUs and CPUs at the same time, with an amplifying effect: "generating $100 worth of CPU will result in $200 worth of GPU + CPU. This is the combination of cloud and AI."
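The conversion Liu describes can be written out as a small calculation. This is a sketch of one reading of the quote, not Alibaba Cloud's accounting: the 1,000-Tokens-per-GPU figure is his illustrative assumption, the 1:1 ratio is taken from the quote, and the function names are made up for the example.

```python
TOKENS_PER_GPU = 1000     # illustrative figure from the quote, not a measured ratio
CPU_TO_GPU_RATIO = 1.0    # the quoted 1:1 growth relationship between GPU and CPU


def gpus_for_tokens(tokens: float) -> float:
    """Convert a Token volume into GPU cards under the assumed ratio."""
    return tokens / TOKENS_PER_GPU


def combined_spend(one_side_spend: float) -> float:
    """Total GPU + CPU spend when one side's spend is matched 1:1 by the other."""
    matched_spend = one_side_spend * CPU_TO_GPU_RATIO
    return one_side_spend + matched_spend


print(gpus_for_tokens(5000))   # GPU cards implied by 5,000 Tokens
print(combined_spend(100))     # combined spend in dollars for $100 on one side
```

Under these assumptions, $100 of spend on one resource is matched by $100 on the other, which gives the $200 combined figure in the quote.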

This means that the explosion of AI Agents will drive the growth of GPUs, CPUs, and storage together. As Alibaba's financial reports indicate, Alibaba Cloud's future growth comes from three directions: public cloud MaaS, private deployment, and the CPU cloud, which is also growing rapidly year-on-year thanks to the explosion of AI Agents.

Starting from that single line of code aimed at Agents on Qianwen Cloud, Alibaba Cloud and Google have both chosen a full-stack reconstruction this May. The entrance is changing because the users have changed; Tokens are exploding because the value has changed; chip development is accelerating because the lifeline cannot be left in the hands of others. And as every Token sold drives the consumption of both GPUs and CPUs, the once-blurred logical chain between the Token economy and cloud infrastructure is being completely rewritten by an industry-wide reconstruction project.

This article is from the WeChat official account "Phoenix Tech". Author: Phoenix Tech. Republished by 36Kr with permission.