Will the ultimate business model of AI be a "gym"?
Doubao Punctures an Illusion
A LatePost report on Doubao, Seedance, and AI commercialization has pushed an increasingly unavoidable industry question to the forefront: how exactly does AI make money?
As of the first half of the year, Doubao, which more than 200 million people use every day, was earning less than one million yuan a day, mainly from e-commerce commissions. By May this year, the computing power the Doubao app consumed each day may have cost tens of millions of yuan. Text chat on its own isn't very expensive, but costs rise sharply once multimodal features such as reasoning, image recognition, voice chat, and video chat are involved.
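To make the gap concrete, the figures above can be spread across daily users. This is only a back-of-the-envelope sketch: the revenue figure is treated as an upper bound, and the compute figure of 10 million yuan is an assumption standing in for the low end of "tens of millions".

```python
# Figures quoted in the article, treated as rough bounds rather than exact data.
DAILY_USERS = 200_000_000        # "more than 200 million people" per day
DAILY_REVENUE_YUAN = 1_000_000   # upper bound: "less than one million yuan"
DAILY_COMPUTE_YUAN = 10_000_000  # assumption: low end of "tens of millions"


def per_user(amount_yuan: float, users: int) -> float:
    """Spread a daily total evenly across daily users."""
    return amount_yuan / users


revenue_per_user = per_user(DAILY_REVENUE_YUAN, DAILY_USERS)
cost_per_user = per_user(DAILY_COMPUTE_YUAN, DAILY_USERS)
cost_to_revenue = DAILY_COMPUTE_YUAN / DAILY_REVENUE_YUAN

print(f"revenue per daily user: at most {revenue_per_user:.3f} yuan")
print(f"compute per daily user: about {cost_per_user:.3f} yuan or more")
print(f"compute cost is roughly {cost_to_revenue:.0f}x revenue or more")
```

Under these assumptions, each daily user brings in at most half a fen while costing about five fen in compute, a gap of at least tenfold.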
This doesn't even account for the investment in intelligent computing centers required for training models. A large-scale intelligent computing center often requires tens of thousands of AI chips, along with supporting power supply, networking, cooling, operations and maintenance, and data center infrastructure. In other words, AI is better understood as a blend of software, cloud computing, electricity, semiconductors, and asset-heavy manufacturing.
Similar changes have also occurred in other big tech companies.
Tencent has given greater priority to agent development platforms such as the enterprise and government editions of WorkBuddy, while Yuanbao's strategic standing has declined. Compared with a chat entry point, these products are closer to enterprise productivity tools, developer tools, and MaaS platforms, and they target B-end customers with budgets, organized processes, and clear efficiency requirements.
Microsoft is also recalculating the AI ledger. In the past, Copilot was closer to a standardized subscription product, but once enterprise agents start continuously calling models, executing tasks, and consuming inference resources, the "fixed price per person per month" model begins to strain. Microsoft has introduced pay-as-you-go billing in some Copilot and agent services, letting enterprises pay for actual usage and keep bills in check through budget and cost management.
Anthropic has taken a more direct approach. Claude Enterprise's usage-based plan has moved from a simple subscription to a hybrid "seat fee + usage fee" model: enterprises first pay for user seats, and actual model usage is billed separately by token.
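The shape of a "seat fee + usage fee" bill can be sketched in a few lines. The class name, function name, and every price below are hypothetical placeholders and do not reflect any vendor's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPlan:
    """Hypothetical 'seat fee + usage fee' plan; all prices are placeholders."""
    seat_fee: float                  # fixed fee per user seat per month
    price_per_million_tokens: float  # metered rate for model usage


def monthly_bill(plan: HybridPlan, seats: int, tokens_used: int) -> float:
    """Fixed seat charges plus usage billed separately by tokens."""
    seat_charge = plan.seat_fee * seats
    usage_charge = plan.price_per_million_tokens * tokens_used / 1_000_000
    return seat_charge + usage_charge


plan = HybridPlan(seat_fee=20.0, price_per_million_tokens=5.0)
light_team = monthly_bill(plan, seats=50, tokens_used=10_000_000)
agent_team = monthly_bill(plan, seats=50, tokens_used=2_000_000_000)
print(light_team, agent_team)
```

With the same 50 seats, the team running agents pays roughly ten times what the light team pays, which is the point of the hybrid structure: the bill follows consumption instead of headcount.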
This is also the biggest difference between this wave of AI and the mobile Internet.
In the mobile era, an app could launch for free, focus on DAU and time spent, capture entry points, and then gradually monetize through advertising, e-commerce, memberships, games, finance, and lifestyle services. Toutiao, Douyin, Xiaohongshu, and Kuaishou are all products of this logic. The marginal cost of content distribution is relatively low: if users spend one more hour scrolling, the platform doesn't consume proportionally more expensive GPUs.
But AI is different. AI consumes computing power with every interaction. The more active users are, the more real the cost; the longer the context, the tighter GPU memory becomes; the more complex the output, the longer the GPU is occupied. Once images, voice, video, and agents are involved, the cost structure looks more like industrial production than Internet traffic distribution.
This means that the first principle of AI is not "traffic" but "computing power".
Doubao's problem is not a lack of users. On the contrary, it has a great many users, and commercialization hasn't kept up. A daily active user base of 200 million indicates real demand, but demand doesn't equal revenue, and revenue doesn't equal profit. If large numbers of users chat for free, generate content for free, and call multimodal capabilities for free, and the platform can't convert that usage into enough revenue, then scale itself becomes a cost burden.
This is the illusion behind the so-called "mobile-Internet-style AI narrative": in the past, we believed that users came first and commercialization followed. But in the AI era, user growth and cost growth are highly correlated. An AI product can't just watch DAU, time spent, and download volume. It must also answer a more fundamental question: how much does each call cost, and who ultimately bears that cost?
From this perspective, the difference between Doubao and Seedance becomes very clear.
Doubao is a general-purpose AI assistant for the general public. Its user base is large, but users' reasons to pay are not strong enough. Ordinary users certainly find AI useful for asking questions, writing, chatting, looking up information, and generating images. However, that value is fragmented and hard to convert into a steady monthly fee. In the Chinese market especially, users have for years been accustomed to free content, free novels, free videos, free meeting software, and free tools. Getting the general public to keep paying for "smarter digital services" is inherently difficult.
Seedance, on the other hand, targets producers such as short-drama companies, comic-drama companies, advertising companies, and content production teams. It doesn't ask ordinary users to pay for "fun"; it helps industries with existing budgets cut costs and improve efficiency. In the past, producing a video required people to draw storyboards, make animations, and do post-production. Now AI can take over part of that process. Customers do the math very directly: if AI-generated video is usable enough, cheaper than human labor, and faster than the old process, then it's worth paying for.
Therefore, the key to AI commercialization is not whether the product is C-end or B-end, but whether users have a clear reason to pay.
Why is AI programming easier to charge for? Because it directly targets programmers, R&D teams, and software companies. AI can shorten development time, increase code output, and replace some repetitive labor. Enterprises pay not for "chatting" but for faster software delivery.
Why does AI video have a chance? Because it plugs directly into content production budgets. Short-drama companies, advertising companies, game companies, and film and television teams already spend money on production capacity. As long as AI can reduce costs, it can take a share of that budget.
The same logic applies to AI customer service, AI legal services, AI investment research, AI design, AI sales leads, and AI data analysis.
Behind Every Conversation, There's an Electricity Bill
This also explains why the companies that seem to be making the most money now are the ones "selling shovels".
Chips, cloud, data centers, electricity, cooling, and networking are the first beneficiaries of the AI era. Whether the eventual winner is OpenAI, Anthropic, Google, ByteDance, Alibaba, or some new application company, all of them need to train models, deploy inference, and buy or rent computing power. The "shovel-sellers" sit upstream in the value chain. They don't need to judge who will strike gold. As long as people keep digging, there will be buyers for shovels.
This is where NVIDIA and cloud providers are the strongest.
However, concluding that "only the shovel-sellers will make money in the end" may be too pessimistic. A more accurate statement is: the shovel-sellers make money first, the foundation model layer is highly concentrated, and a large number of downstream applications will fail, but applications that are truly embedded in workflows and own the payment scenario still have a chance.
The ultimate business model of AI may not be a single model but a combination of three models.
The first is the "gym-membership model" for the C-end.
The core of the gym-membership model is that most people pay but rarely show up, so low-frequency users subsidize a few high-frequency users. AI subscriptions follow a similar logic. Light users ask a few questions and generate a few images each month, and the platform makes money; heavy users write code every day, run agents, make videos, and read long documents, and the platform loses money.
The ideal users for AI subscriptions are those who "are willing to pay for capabilities but won't use up their quota".
This is exactly the same as the users that gyms like the most: they buy an annual membership, come a few times occasionally, and feel that they have a healthy lifestyle.
The problem is that AI's most valuable users are often the highest-frequency and highest-cost ones. The more useful programmers, designers, short-drama companies, investment researchers, and content teams find AI, the more intensively they use it. For these users, the platform can no longer rely on "gym-style membership" and must switch to pay-by-usage or pay-by-result billing.
What AI subscriptions really bet on, then, is that users are willing to pay but won't overuse.
This is also why pure unlimited subscriptions are hard to sustain in the long run. The marginal cost of AI is very concrete. Every token, every image, every second of video, and every deep-research run can be converted into GPU time, electricity, GPU memory, scheduling, and depreciation. If heavy users flood in, the subscription model breaks.
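The cross-subsidy behind a flat subscription, and the point at which it breaks, can be sketched as follows. The two-tier split into "light" and "heavy" users and all of the numbers are illustrative assumptions, not measured data.

```python
def blended_margin(price: float, light_cost: float, heavy_cost: float,
                   heavy_share: float) -> float:
    """Average monthly profit per subscriber for a given share of heavy users."""
    avg_cost = (1 - heavy_share) * light_cost + heavy_share * heavy_cost
    return price - avg_cost


def break_even_heavy_share(price: float, light_cost: float,
                           heavy_cost: float) -> float:
    """Share of heavy users at which a flat fee stops making money.

    Assumes heavy users cost more to serve than light users.
    """
    return (price - light_cost) / (heavy_cost - light_cost)


# Hypothetical: flat fee of 20, light users cost 2 to serve, heavy users cost 80.
PRICE, LIGHT, HEAVY = 20.0, 2.0, 80.0
for share in (0.05, 0.20, 0.40):
    margin = blended_margin(PRICE, LIGHT, HEAVY, share)
    print(f"heavy share {share:.0%}: margin per subscriber {margin:+.2f}")
print(f"break-even heavy share: {break_even_heavy_share(PRICE, LIGHT, HEAVY):.1%}")
```

In this toy setup the plan is comfortably profitable when 5% of subscribers are heavy users, barely profitable at 20%, and loss-making at 40%; the break-even point sits at about 23%.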
Subscription Isn't the Endgame; Quota Is the Ledger
Therefore, C-end AI is more likely to become a combination of "membership + quota + over-quota package". Ordinary chat is almost unlimited, but use of advanced models is capped. Image and video generation consume points, deep research is charged per run, and code agents are charged by task volume. What users see are memberships, points, creation quotas, and deep-research runs; what the platform calculates internally are tokens, GPU seconds, inference costs, and unit gross profit.
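A minimal sketch of this two-sided ledger: the user-facing bill is the membership fee plus charges beyond each quota, while the platform books a cost for every unit served. The feature names, quotas, prices, and unit costs are all hypothetical.

```python
# Hypothetical plan: every name and number here is illustrative.
MEMBERSHIP_FEE = 30.0
INCLUDED = {"advanced_chat": 200, "video_seconds": 60, "deep_research": 5}
OVERAGE_PRICE = {"advanced_chat": 0.05, "video_seconds": 0.5, "deep_research": 2.0}
UNIT_COST = {"advanced_chat": 0.02, "video_seconds": 0.3, "deep_research": 1.0}


def user_bill(usage: dict[str, int]) -> float:
    """What the user sees: membership fee plus charges beyond each quota."""
    overage = sum(
        max(0, used - INCLUDED.get(feature, 0)) * OVERAGE_PRICE[feature]
        for feature, used in usage.items()
    )
    return MEMBERSHIP_FEE + overage


def platform_cost(usage: dict[str, int]) -> float:
    """What the platform books internally: cost of every unit served."""
    return sum(used * UNIT_COST[feature] for feature, used in usage.items())


usage = {"advanced_chat": 300, "video_seconds": 100, "deep_research": 5}
bill = user_bill(usage)
cost = platform_cost(usage)
print(f"bill={bill:.2f} cost={cost:.2f} gross_profit={bill - cost:.2f}")
```

The design choice worth noting is that quota units (runs, seconds, points) are what the user is billed in, while the cost table stays internal; the platform can reprice quotas without exposing tokens or GPU seconds.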
The second is the "cloud-service model" for the B-end.
The business model of cloud services is essentially this: cloud providers first invest heavily in building data centers, servers, chips, networks, and basic software, then divide these resources into standardized capabilities and rent them to enterprises on demand. Enterprises don't need to build their own server rooms, buy servers, or hire operations staff. Instead, they pay for computing, storage, databases, bandwidth, and API calls.
B-end AI is very similar to cloud services. Model APIs, MaaS platforms, enterprise agents, knowledge bases, AI programming, and AI video generation all essentially turn "intelligent capabilities" into a measurable resource. How many tokens, how much context, how much image recognition, how much voice transcription, how many seconds of video, and how much agent execution time an enterprise uses determines how much it costs.
But AI is more complex than traditional clouds. Cloud services sell resources, while AI is best at selling results.
Enterprise customers care about whether customer-service costs have decreased, whether code delivery has become faster, whether there are more advertising materials, whether video production has become cheaper, whether investment research reports have become more efficient, and whether legal reviews have reduced the need for human labor.
Therefore, the best B - end AI business model is to settle accounts by resources in the background like cloud services and charge by value in the foreground like SaaS or industry tools.
Enterprises Pay for Results
What is told to customers is: I've handled a thousand customer-service conversations, generated a hundred advertising materials, completed a piece of runnable code, and finished an investment research report.
What is calculated within the company is: How many tokens, how many GPU seconds, how many failed retries, and how much engineering scheduling cost these tasks have consumed.
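The two views can be put side by side in a small sketch: revenue is counted per delivered result, while cost is counted per attempt, so failed retries still show up on the internal ledger. The `Attempt` type, the rates, and the per-attempt overhead are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One attempt at a task; failed attempts still consume tokens."""
    tokens: int
    succeeded: bool


def task_economics(attempts: list[Attempt], price_per_result: float,
                   yuan_per_million_tokens: float,
                   overhead_per_attempt: float) -> dict[str, float]:
    """Customer-facing revenue versus internal cost for a batch of attempts."""
    delivered = sum(1 for a in attempts if a.succeeded)
    revenue = delivered * price_per_result
    cost = sum(
        a.tokens / 1_000_000 * yuan_per_million_tokens + overhead_per_attempt
        for a in attempts
    )
    return {"revenue": revenue, "cost": cost, "gross_profit": revenue - cost}


# Hypothetical batch: two delivered results and one failed retry.
attempts = [Attempt(40_000, True), Attempt(60_000, False), Attempt(50_000, True)]
ledger = task_economics(attempts, price_per_result=2.0,
                        yuan_per_million_tokens=10.0, overhead_per_attempt=0.1)
print(ledger)
```

The customer is invoiced for two results; the company pays for three attempts. A rising retry rate therefore erodes margin without ever appearing on the customer's bill.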
The third is the "pay-by-result model".
Ultimately, AI sells neither models nor tokens but verifiable business results.
If an AI application merely forwards user requests to upstream models, it is essentially helping the upstream sell tokens and will struggle to turn a profit itself. Truly profitable downstream players must package tokens into workflows, turn computing power into results, and turn results into bills.
AI programming tools like Codex, Claude Code, and Cursor have rebuilt the interface through which developers write code. Microsoft has embedded Copilot in Office, and ByteDance has embedded AI in advertising placement, video editing, short-drama production, and e-commerce merchant tools. They, too, are selling production tools.
This is the real opportunity in the AI application layer: not to create another "I can also chat" app, but to become part of an industry's workflow.
Here we need to return to tokens.
What are tokens? They are not the GPU itself but the basic unit in which a model measures the information it processes. Input tokens are the content the model reads, and output tokens are the content it generates. Generally, the more tokens there are, the more the model has to read, compute, and generate, and all of that ultimately converts into GPU computation, GPU memory usage, inference time, electricity, cooling, and system scheduling costs.
Tokens are like the electricity meters of the AI era. What users see are questions and answers, images, videos, and code, while what the platform's back end sees are tokens, GPU seconds, and gross profit per task.
The GPU is like a power plant and factory equipment. Tokens don't gradually "wear it out", but sustained high-load operation brings power consumption, heat, GPU memory pressure, hardware aging, and accounting depreciation.
More importantly, the lifespan of an AI GPU is not just its physical lifespan but its economic lifespan. Even if the card still works, once a new-generation chip offers stronger performance and lower energy consumption and the old card's cost per token becomes too high, the old card will be pushed down to low-end tasks or even scrapped on economic grounds.
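One way to sketch "economic lifespan" is to compare cost per million tokens: an old card that is fully paid off still has running costs, and it becomes uneconomic once those running costs per token exceed a new card's full cost per token, depreciation included. This is a simplified replacement rule, and the power draw, electricity price, hosting cost, depreciation, and throughput figures are all hypothetical.

```python
def hourly_running_cost(power_kw: float, yuan_per_kwh: float,
                        hosting_per_hour: float) -> float:
    """Cash cost of keeping a card busy for one hour (power plus hosting)."""
    return power_kw * yuan_per_kwh + hosting_per_hour


def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Convert an hourly cost into a cost per million tokens served."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical old card: fully paid off, so only running costs count.
old_unit = cost_per_million_tokens(
    hourly_running_cost(power_kw=0.7, yuan_per_kwh=0.6, hosting_per_hour=1.0),
    tokens_per_second=500,
)
# Hypothetical new card: must also recover 6.0 yuan/hour of depreciation.
new_unit = cost_per_million_tokens(
    hourly_running_cost(power_kw=1.0, yuan_per_kwh=0.6, hosting_per_hour=1.0) + 6.0,
    tokens_per_second=5000,
)

economically_obsolete = old_unit > new_unit
print(f"old card: {old_unit:.3f} yuan per million tokens")
print(f"new card: {new_unit:.3f} yuan per million tokens")
print("old card economically obsolete:", economically_obsolete)
```

In this example the old card costs nothing to own, yet each token it serves still costs nearly twice as much as on the new card, because throughput rather than purchase price dominates the unit cost.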