
Your AI charges by the word.

Friends of 36Kr · 2026-03-29 09:21
Tokens have become the "currency" of the AI era.

"Tokens have become the 'currency' in the AI era. Whether it's human-AI interaction or AI-AI collaboration, all are accomplished with Tokens as the core medium." During the 2026 Zhongguancun Forum Annual Conference, a relevant technical leader from Moore Threads told China News Service.

With the iteration of computing power infrastructure and the explosion of AI agent applications, "Token", as a unit of measurement for AI information processing, has become a keyword at this year's Zhongguancun Forum.

From Concept to Reality

On March 25, the National Commission for Terminology in Science and Technology issued an announcement designating "word element" as the Chinese name for "Token" and releasing the term for public trial use.

"All large models use Tokens as the unit of measurement. Tokens are the core measurement unit in AI." Lin Songtao, the vice president of THORS Information Technology Co., Ltd., told China News Service. "Just as electricity is billed by kilowatt - hours, Tokens are the 'kilowatt - hours' in AI. Through the consumption of underlying energy such as electricity, it is ultimately converted into Token output."

Since the beginning of this year, AI agents led by OpenClaw (hereinafter referred to as Lobster) have surged in popularity, and demand for word elements has expanded rapidly. According to data from the National Data Bureau, China's daily average Token calls stood at 100 billion at the beginning of 2024, soared to 100 trillion by the end of 2025, and exceeded 140 trillion in March this year, an increase of more than a thousandfold in about two years.
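The growth figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `fold_increase` and the variable names are illustrative, and the inputs are simply the National Data Bureau figures cited in this article.

```python
def fold_increase(start: int, end: int) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


BILLION = 10**9
TRILLION = 10**12

# Daily average Token calls in China, as quoted in the article.
early_2024 = 100 * BILLION
end_2025 = 100 * TRILLION
march_2026 = 140 * TRILLION

print(fold_increase(early_2024, end_2025))    # growth through the end of 2025
print(fold_increase(early_2024, march_2026))  # growth through March 2026
```

The two ratios come out to 1,000 and 1,400, which matches the "more than a thousandfold" description.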

According to data from OpenRouter, a third-party AI model aggregation platform, OpenClaw alone accounted for 20% of the word element consumption on the platform during the week of March 9 to 15, 2026. Its weekly consumption was equivalent to 60% of the entire platform's average weekly word element consumption in the fourth quarter of 2025.
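Taken together, the two OpenRouter percentages also imply how much the platform as a whole has grown. The sketch below works through that inference, assuming both percentages are measured against the same platform-wide total; the variable names are illustrative and do not come from OpenRouter.

```python
from fractions import Fraction

# OpenClaw's share of platform consumption in the week of March 9-15, 2026.
share_of_current_week = Fraction(20, 100)
# The same OpenClaw volume, expressed against the Q4 2025 weekly average.
share_of_q4_weekly_avg = Fraction(60, 100)

# openclaw = 0.20 * current_week and openclaw = 0.60 * q4_weekly_avg,
# so current_week / q4_weekly_avg = 0.60 / 0.20.
platform_growth = share_of_q4_weekly_avg / share_of_current_week
print(platform_growth)
```

Under that assumption, the platform's total consumption in that week was roughly three times its weekly average for the fourth quarter of 2025.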

"The rise of Lobster has further promoted the transformation of the artificial intelligence computing power usage model. Now, it is mainly based on inference and services, and Tokens have become very standardized." Li Bin, the senior vice president of Sugon, pointed out. This change stems from the transformation of the AI computing power usage model. The computing power infrastructure supporting AI development is gradually changing from a computing power factory to a word element factory.

In the view of Zhou Hongyi, founder of 360 Group, Tokens are the digital energy of the AI era and the unit by which computing power is converted into intelligence. Together with electricity and computing power, they sit at the core of AI infrastructure. He pointed out that computing power is the basis of Token production and that inference computing power is the key to supporting Token consumption. The explosion in Tokens will in turn force computing power to upgrade. "In essence, the competition for Tokens is a competition for computing power, and at a deeper level, a competition for electricity and energy efficiency," he said.

Zhou Hongyi told China News Service that the popularity of agent applications such as Lobster marks the Token economy's move from concept to reality. "Lobster has taught users the habit of paying, turning Tokens from an industry technical indicator into a carrier of value that end users can perceive." In his judgment, the current daily average consumption of 140 trillion Tokens is only the starting point of the explosion and is still far from the stage of large-scale enterprise adoption. The brute-force inference style of L5-level agents such as Lobster will push Token consumption onto an exponential growth path.

Zhou Hongyi further pointed out that the business logic of AI is being rebuilt and that its business model may shift to a "pay-as-you-go" Token economy. "From the traffic economy of the Internet era to the Token economy of the Agent era, the underlying logic has changed qualitatively. The traffic economy is an attention economy with a marginal cost approaching zero, and it does not create new productivity. The Token economy is a productivity pricing model backed by computing power, chips, and electricity. The more users there are, the greater the consumption and the higher the cost," he said.

Volcengine's official website shows that Tokens for its AI audio-video interaction solution can be billed either by the actual number of Tokens consumed or through prepaid resource packages. The pay-per-use rate is 12 yuan per million Tokens. Huawei Cloud's official website shows that prices vary by model version: for package billing, the list price is 2.2 to 5.6 yuan for 1 million Tokens valid for one month, and 2,199 to 5,598 yuan for 1 billion Tokens valid for three months.
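To make the two billing modes concrete, here is a minimal sketch in Python. The function names and the 2.5 million Token usage figure are hypothetical. Only the unit prices come from the figures quoted above, and since they belong to different products from different vendors, the numbers are not a like-for-like comparison.

```python
from decimal import Decimal

MILLION = 1_000_000


def pay_per_use_cost(tokens: int, price_per_million: Decimal) -> Decimal:
    """Cost in yuan when billed on actual Token consumption."""
    return Decimal(tokens) / MILLION * price_per_million


def package_price_per_million(package_price: Decimal, package_tokens: int) -> Decimal:
    """Effective price in yuan per million Tokens for a prepaid package."""
    return package_price * MILLION / package_tokens


# Hypothetical usage of 2.5 million Tokens at the quoted 12 yuan per million.
print(pay_per_use_cost(2_500_000, Decimal("12")))

# Effective unit price of the quoted 1-billion-Token package at its lowest list price.
print(package_price_per_million(Decimal("2199"), 1_000_000_000))
```

Using `Decimal` rather than floating point keeps the yuan amounts exact, which is the usual choice for billing arithmetic.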

How to Make Money?

On the commercialization of AI, Wang Ai, chief marketing officer of Honor Embodied Intelligence, believes that Agents are becoming a new productive force. Agents generate large numbers of Tokens during use, and a closed business loop for AI built on Tokens may emerge in the future. "Just as electricity and water are billed by the kilowatt-hour and the ton, AI is billed by the Token. Fees will also be tiered according to the capabilities of the model being called," he said.

"Tokens need computing power for output, but now there are more evaluation dimensions and indicators. Originally, the computing power of a computing power system was an evaluation indicator. In the future, how to produce Tokens more economically has become an evaluation indicator." During the 2026 Zhongguancun Forum Annual Conference, Li Bin, the senior vice president of Sugon, told China News Service.

Li Bin said that from the user's perspective, what matters most about Tokens is response speed, that is, whether feedback arrives shortly after a question is asked. From the computing power operator's perspective, the question is how many concurrent users can be served while still ensuring a basic usage experience under high concurrency.

Zhou Hongyi believes there are two paths to monetizing Tokens. General-purpose Tokens follow a mass-market route, relying on massive consumption to earn thin margins at high volume and becoming a basic service like water and electricity. Tokens for vertical scenarios and high-value tasks, such as in security and industry, earn high profit margins by relying on technology and scenario barriers. "The core is to enhance the value of Tokens, with scale as the foundation and the technology premium as the increment."

In the view of the technical lead from Moore Threads, the core of the Token economy is the efficiency of collaboration between humans and AI, and between AI and AI. Whether the business loop closes depends on Token output per unit cost.

"We should not only pursue the quantity of Tokens but also pay attention to cost - effectiveness. We need to ensure accurate, fast, stable, and safe calculations, and at the same time, reduce the Token cost to the lowest. This is the key to making computing power usable and good to use." The leader pointed out that as AI agents enter the application era, the growth rate of the demand for inference computing power is much higher than that of training.

It is worth noting that large-scale development of the Token economy depends on the underlying support of computing power. He Shuibing, deputy director of Zhijiang Laboratory, pointed out that expanding the scale of computing power does not necessarily bring a matching increase in Token output capacity. "Problems such as scheduling bottlenecks and shortcomings in communication and storage performance all affect how efficiently computing power is released and reduce the Token output efficiency per unit of computing power." Taking the H100 million-card cluster as an example, he said, "The annual computing power cost is about 1.2 billion yuan. If there is a 10% loss in computing power utilization, the annual direct economic loss will exceed 120 million yuan."
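The loss estimate in that example is a direct proportion of the annual cost. A minimal sketch of the arithmetic, using only the two figures quoted above (the function name is illustrative):

```python
def annual_loss_yuan(annual_cost_yuan: int, utilization_loss_percent: int) -> int:
    """Direct annual loss, treating wasted utilization as wasted spend."""
    return annual_cost_yuan * utilization_loss_percent // 100


annual_cost = 1_200_000_000  # "about 1.2 billion yuan" per year
loss = annual_loss_yuan(annual_cost, 10)
print(f"{loss:,} yuan")
```

At exactly 1.2 billion yuan the loss works out to 120 million yuan. Because the quoted cost is approximate, the result is best read as an order-of-magnitude figure.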

Xia Lixue, co-founder and CEO of Infinigence AI, pointed out at the AI Open-Source Frontier Forum that AI development is still in a long-term process of steady advancement, and that its vitality depends on whether a sustainable Token supply system can be built. From an infrastructure perspective, resources are ultimately limited. From the perspective of a "Token factory", the key issue is whether Tokens can be supplied continuously, stably, and at scale so that top-tier models can truly serve more downstream scenarios over the long term.

Luo Fuli, head of Xiaomi's large model team, pointed out at the AI Open-Source Frontier Forum that, given the rapid progress of large models and the support of Agent frameworks, Token consumption may increase 100-fold in 2026.

Li Bin believes that as Token consumption grows explosively, demand for computing power will continue to expand. Since OpenClaw appeared, Token consumption has increased exponentially, and once computing power output is standardized, he sees no ceiling on demand growth. Previously, accessing computing power carried a high barrier to entry. Now, with agents as the usage interface and flexible super-nodes as standard equipment, that barrier has fallen further, leaving huge room for future growth in computing power.

This article is from the WeChat official account "China News Service" (ID: jwview), written by Zhou Yihang and Xie Jingwen, and is published by 36Kr with authorization.