Celebrate "No. 1 in AI Cloud", ByteDance and Alibaba Divide the Pie
Text | Deng Yongyi
Editor | Su Jianxun
Cover Source | Photo by 36Kr
In late September in Hangzhou, the fierce competition between Alibaba and ByteDance over who ranks first in the "AI cloud market" filled the city with the smell of gunpowder.
During Alibaba Cloud's annual flagship event, the Cloud Computing Conference, attendees could hardly miss the large-scale advertisements Volcengine had put up at major transportation hubs such as Hangzhou Xiaoshan Airport and Hangzhou East Station, each highlighting a single market-share figure: "46.4% of China's public cloud large-model market."
The statement is thought-provoking. Volcengine never used superlatives like "the best" or "the first", but it is well known that China has roughly ten ranked cloud providers, and here was Volcengine claiming nearly half the market by itself. It says nothing, and yet says everything.
Source: Photo by 36Kr
Some industry insiders in the cloud sector questioned the advertisement. They pointed out that Volcengine's figure counts only the MaaS model, mostly in closed-source form. If a customer uses a cloud provider's IaaS and PaaS to deploy models downloaded from the open-source community, that usage is excluded from the statistics. Judging the competitive landscape solely by consumption of models served as MaaS is not comprehensive.
In fact, it is not only Volcengine: by now every company can claim to be "No. 1 in the AI cloud market".
Across advertisements and market-research reports, Alibaba, ByteDance, and Baidu have all officially declared themselves "the first", each with a different emphasis, which makes the situation rather charged.
Chart: Intelligent Emergence
Who is the number one in China's AI cloud market? This has become the hottest topic among cloud providers and AI companies in September.
Some Build Kitchens, Some Deliver Takeaways
First, a fact worth admitting: every one of these "No. 1 in AI cloud" claims by model providers is correct. The core of the problem is not the authenticity of the data but the different ways of slicing the "AI cloud market" pie, that is, different statistical scopes.
So, in which dimensions did each company "win"?
Let's break it down specifically:
ByteDance's Volcengine is first in large-model public cloud call volume (MaaS), meaning the total call volume of all large models on its public cloud platform. Beyond ByteDance's own Doubao models (excluding internal calls from Doubao, Douyin, and other in-house businesses), Volcengine also hosts a large number of third-party models; as long as the tokens are called on its cloud, they count toward its share.
Alibaba Cloud's "AI cloud" covers full-stack services, from infrastructure (such as GPUs) to PaaS (platform) to upper-layer MaaS (model calls), measured by total revenue.
Baidu claims to be "first in the AI public cloud service market", a scope that folds in more revenue from product development and customized industry services. Baidu has also disclosed finer-grained indicators, such as "ranking first in both the number and value of large-model bid wins in the first half of 2025".
Then there is the model provider Zhipu. After releasing GLM-4.5, it announced that, by OpenRouter's statistics, its model-call revenue equaled the combined revenue of all other domestic models.
It is understandable that each company picks its own yardstick: cloud computing is a mature business with numerous product modules and diverse ways of buying and delivering services.
For example, a small or medium-sized enterprise may rent a virtual server from a cloud provider by the month, while individual developers often have no need for long-term servers at all; they simply register for an API and pay as they go. When end users start using an AI application, the requests land on the cloud provider.
The same logic applies to large models: from individuals and small and medium-sized enterprises to large enterprises, demand takes very different shapes.
Volcengine's emphasis on "large-model call volume", for instance, makes it like a systematized, large-scale chain restaurant that only does takeaway: it delivers the finished dish (the model's inference result).
Put bluntly, Volcengine's bet is that customers don't need to care how the kitchen is set up, whether that means ByteDance's Doubao models, the chips, or the inference framework. All that matters is that the dish arrives fast and tastes good.
So you can see that Volcengine emphasizes the "speed" of its models in many public statements.
In August, after DeepSeek updated to V3.1, Volcengine highlighted a time per output token (TPOT) of 20-40 ms and an initial concurrency quota of 5 million TPM (tokens per minute), the highest level in the industry at the time.
TPM is the number of tokens a service can process per minute and is a key measure of serving capacity. In a medium-usage scenario (such as document summarization or code generation), assuming each user consumes 500 tokens per minute, a TPM of 5 million can serve 10,000 concurrent users.
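The sizing above is simple division, and can be sketched as follows (the 500-tokens-per-user-per-minute figure is the article's illustrative assumption, not a Volcengine specification):

```python
# Rough capacity estimate: how many concurrent users a given TPM quota supports,
# assuming the article's illustrative 500 tokens consumed per user per minute
# (a medium-usage scenario such as summarization or code generation).

def concurrent_users(tpm_quota: int, tokens_per_user_per_min: int = 500) -> int:
    """Return how many users a tokens-per-minute quota can serve at once."""
    return tpm_quota // tokens_per_user_per_min

print(concurrent_users(5_000_000))  # 5M TPM / 500 tokens per user -> 10000 users
```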
This is a traffic-first approach with fast growth, in other words, one better suited to Internet AI applications with elastic workloads and to small and medium-sized developers.
The AI cloud markets that Alibaba and Baidu emphasize are instead about revenue accounting.
Alibaba Cloud's definition of AI cloud, for example, covers revenue generated across the whole stack: underlying IaaS (infrastructure such as GPU computing power), PaaS (platform), and upper-layer MaaS (model-as-a-service).
This is more like a fully equipped kitchen: customers get access to high-end kitchenware (underlying computing power) as well as the finished dishes (model services), so they can do far more with it.
Why take the full-stack, self-developed route?
Because selling model services alone builds little long-term moat. An engineer working on large-model inference services offered "Intelligent Emergence" an analogy: "If someone does their weekly shopping at a supermarket (full-stack cloud services), then when they want to buy a cup (an API), unless the cups elsewhere are outstanding, they'll probably just grab one at the supermarket."
Whether in cloud computing or large models, real stickiness comes from the overall solution and from data binding. Few customers use only pure model inference, such as APIs, which are easy to migrate away from.
"Mostly they need a complete product bundle: APIs, databases, virtual machines, and so on," the engineer said.
The Era of Differentiation Is Coming
Another fact worth admitting: however bustling and fast-growing the large-model field looks, the overall market is still small.
In May 2024, Volcengine blew the MaaS market open with price cuts, lowering its flagship Doubao Pro-32k model to 0.0008 yuan per thousand tokens, a reduction of 99.3%.
This directly triggered an industry-wide price war, with Alibaba, Tencent, Baidu, and others following suit. Volcengine's call volume soared from 120 billion tokens before the cut to over 500 billion.
IDC statistics put the country's total large-model call volume in the first half of 2025 at 536.7 trillion tokens. Priced roughly at last year's post-cut rate for Doubao Pro 128k (0.0005 yuan per thousand tokens), the MaaS market in 2025 totals about 500-600 million yuan.
By contrast, Alibaba Cloud's revenue in 2024 alone exceeded 80 billion yuan. Next to the traditional cloud-computing market, today's MaaS market barely registers.
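That rough sizing can be reproduced with back-of-the-envelope arithmetic, annualizing the H1 call volume at the quoted post-cut price. This is only a sketch: real revenue mixes many models and price tiers, which is presumably why the article quotes a 500-600 million band rather than a point estimate.

```python
# Back-of-envelope MaaS market sizing from the article's figures.
h1_tokens = 536.7e12          # H1 2025 total call volume (IDC), in tokens
price_per_1k_tokens = 0.0005  # yuan, Doubao Pro 128k post-price-cut rate

annual_tokens = h1_tokens * 2  # naive full-year annualization of the H1 figure
revenue_yuan = annual_tokens / 1000 * price_per_1k_tokens
print(f"~{revenue_yuan / 1e6:.0f} million yuan")  # ~537 million, inside the 500-600M band
```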
The question is what each company believes in and is willing to invest in at this stage, and that is what drives the divergent strategies.
Cloud computing is a scale business with a strong Matthew effect. Leaning on its incumbency in traditional cloud services, Alibaba Cloud naturally gravitates toward full-stack solutions, adding model services on top to attract customers.
As a latecomer, Volcengine has a harder fight in the existing cloud-computing market (CPU-based workloads, for example), so it focuses on the future growth of GPUs and MaaS (model inference services).
In a public report in May, Volcengine president Tan Dai said that "the marathon has only run its first 500 meters" and that the market will expand at least a hundredfold. Volcengine's recently released figures show daily token usage of the Doubao large model exceeding 16.4 trillion, more than 137 times its level at release in May last year.
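As a rough sanity check on the article's figures, the two disclosures are mutually consistent: dividing today's 16.4 trillion daily tokens by the 137x growth multiple recovers roughly the 120 billion tokens cited earlier as the pre-price-cut volume, if both figures are daily volumes.

```python
# Cross-check: 16.4 trillion daily tokens, said to be 137x the launch-era level,
# implies a launch-era volume of roughly 120 billion tokens, matching the
# pre-price-cut call volume the article cites.
daily_tokens_now = 16.4e12
growth_multiple = 137

launch_era_tokens = daily_tokens_now / growth_multiple
print(f"~{launch_era_tokens / 1e9:.0f} billion tokens")  # ~120
```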
There is, for now, no unified standard for deciding who is No. 1 in the AI market.
At the Cloud Computing Conference, Alibaba Cloud responded directly to the question of statistical scope: looking only at large-model call volume on the public cloud is like seeing only the tip of the iceberg.
The reason is that many customers never call models on the public cloud at all. Alibaba Cloud's Tongyi model family follows an open-source route, and many enterprises simply download the models and deploy them on their own private clouds or local servers.
The call volume of these self-deployed models is enormous, but it cannot show up in any external data report.
Alibaba Cloud instead measures total revenue of the entire "AI cloud" business, a very broad scope that includes GPU computing power rental (infrastructure), platform services (PaaS), and model-call services (MaaS). On this "total revenue" dimension, Alibaba Cloud is first.
And to demonstrate the influence of one's own large models, there are plenty of finer dimensions to probe.
In an interview after the Cloud Computing Conference, Alibaba Cloud also argued for distinguishing whether the models a company sells are self-developed or resold from others, and whether its model services handle complex tasks (which only high-quality flagship models can complete) or simple offline labeling work.
A Frost & Sullivan report in September reflected this emphasis on self-developed capability: measured by each company's share of daily average token calls to "self-developed models", Alibaba's Tongyi led the first half of 2025 with 17.7%, ahead of ByteDance's Doubao at 14.1% and DeepSeek at 10.3%.
A price war is hard to sustain. Large models are shifting from competing on single indicators, such as infrastructure scale and token call volume, to competing on efficiency and depth of service.
One telling sign: in 2024, a simple price cut could still ignite the token market; in 2025 it no longer works. Tokens are already cheap enough, so enterprises now care more about model quality and efficiency.
Smaller AI model providers lack the resources to follow the big players' full-stack route, so differentiation becomes critical.
A recent example: in 2025, Moonshot AI's new K2 model was deeply optimized for coding tasks at a highly competitive price, and its call volume surged almost overnight. Last week Zhipu likewise launched a GLM Coding plan for programming scenarios and raised the concurrency limits of its coding services.
Business competition is straightforward. One unbreakable rule: when you trail on the metrics your opponents excel at, redefine a metric around your own strengths. And the question that matters most in today's large-model field is how many customers are willing to pay.
This article is from the WeChat official account "36Kr Pro" and is published by 36Kr with authorization.