
The New Form of AI Involution at Big Tech: Tokens Become the New "PPT"

Emphasize Next, 2026-05-25 09:56
The "Token Legend" title has become the PPT of the new era. Employees are busy having Agents churn through data, while bosses foot the bill for wasted cycles.

The money enterprises have saved through cost reduction and efficiency gains may now be getting burned on wasted Tokens.

Microsoft has reportedly begun scaling back internal Claude Code licenses. According to The Verge, Microsoft's Experiences + Devices team will shut down most of its third-party Claude Code seats by the end of June and switch fully to its own GitHub Copilot CLI. One of the main goals is to relieve AI cost pressure. Uber's situation is even more severe: its CTO, Praveen Neppalli Naga, publicly admitted that the company's AI budget for all of 2026 was basically exhausted in just four months.

Meta is heading in exactly the opposite direction. It has launched an internal Token consumption leaderboard, awarding titles such as "Token Legend" and "Cache Magician" to high-usage employees. It even links usage to performance evaluations and applies a last-place elimination system. Just 30 days after the mechanism took effect, total Token consumption across Meta employees soared from 6 trillion to 73.7 trillion, more than 12 times the starting level, and AI consumption spiraled completely out of control.

Some companies are hitting the brakes while others are stepping on the accelerator, but they face the same problem: the industry still has no mature, stable, and workable standard for measuring AI value. As a result, Token consumption, which is easy to count, has become the only hard metric.

Robin Li, the CEO of Baidu, pointed out at a conference two weeks ago that while Tokens are easy to count, they do not equal actual output.

If an employee runs more Agents, feeds in longer contexts, and lets the model make repeated trial-and-error attempts, the bill rises quickly, but business results may not improve in step.

01.

Token KPI: An Assessment Experiment That Creates Waste

The work style that aims to maximize Token consumption ("Tokenmaxxing") emerged in Silicon Valley at the end of last year and has now spread to China. Technology teams at large companies such as Alibaba, Tencent, and ByteDance have, to varying degrees, made Token usage a reference factor in decisions on converting probationary employees to regular status and on promotion.

When performance evaluation is tightly bound to Token usage, workplace formalism quickly spreads to AI work. According to Caijing magazine, many employees deliberately have AI Agents read tens of thousands of lines of code in batches and pile up tens of thousands of words of reference material just to hit their targets, padding their workload to boost Token consumption without producing any actual output. This is not an isolated case. Public data shows that nearly half of the Token consumption in enterprise AI applications worldwide is wasted.

How much of Meta's 73.7 trillion Tokens has truly been converted into effective output? That this question cannot be answered is precisely the core flaw of every Token KPI system.

Unlike Silicon Valley companies with their cost anxiety, top domestic tech companies are using generous Token subsidies to lower the threshold for employees to use AI as far as possible.

Judging from information disclosed through various channels, each company's benefits have a different focus. Tencent provides core R&D staff with a dedicated annual Token package worth 228,000 yuan, plus a monthly reimbursement of $1,000 for external tools. ByteDance has opened its AI tools for unlimited internal use, and employees can claim 50% reimbursement for AI tools they try outside work, capped at $1,000 per year for technical positions. Baidu gives technical positions unlimited access to Wenxin Yiyan, plus an external Token subsidy of up to $800 per year. 360 has directly credited 100 million Tokens to every employee.

AI tools are no longer just office software plugins; they are becoming new means of production. In the past, enterprises provided employees with computers, software accounts, cloud storage, and reimbursement limits; now, R&D, design, product, and operations staff may all need model call quotas. Especially in scenarios such as code generation, Agent workflows, video generation, and knowledge retrieval, Tokens are the fuel for work.

The problem is that many companies haven't figured out how to calculate the "fuel consumption" after distributing the "fuel".

02.

The Money-Guzzling Agents and the Unclear Variable Accounts

Uber's internal data lays bare the mechanism behind runaway enterprise AI costs. Currently, 95% of its engineers use AI coding tools regularly, at a monthly AI call cost of $500 to $2,000 per person. 70% of code submissions are generated by AI. AI Agents complete 1,800 code changes per week, and their share of the related workload has risen from less than 1% to 8%.

From the perspective of business implementation, this represents a significant increase in AI penetration. However, for the enterprise's finance department, it means that the rigid and controllable IT cost system has been completely broken.

The root cause of runaway costs is the high Token consumption of AI Agents. Gartner points out that, to complete the same tasks, Agents consume 5 to 30 times as many Tokens as traditional chatbots. Goldman Sachs goes further and predicts that by 2030 global Token consumption will reach about 120 quadrillion per month, 24 times the 2026 level, driven by large-scale deployment of enterprise Agents.
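The two multiples quoted above can be unpacked with a little arithmetic. The sketch below only rearranges the figures already cited (5x to 30x, 120 quadrillion, 24x); the function names and the one-million-Token chatbot workload are hypothetical, chosen purely for illustration.

```python
# Figures quoted in the text; everything else here is illustrative.
MONTHLY_TOKENS_2030 = 120 * 10**15   # about 120 quadrillion Tokens per month
GROWTH_VS_2026 = 24                  # 2030 level as a multiple of the 2026 level
AGENT_MULTIPLIER_RANGE = (5, 30)     # Agent vs. chatbot Token use for the same tasks


def implied_2026_monthly_tokens(tokens_2030: int, multiple: int) -> int:
    """Back out the 2026 monthly level implied by the 2030 forecast."""
    return tokens_2030 // multiple


def agent_token_range(chatbot_tokens: int, multiplier_range=AGENT_MULTIPLIER_RANGE):
    """Low and high Token estimates if the same workload is moved to Agents."""
    low, high = multiplier_range
    return chatbot_tokens * low, chatbot_tokens * high


baseline_2026 = implied_2026_monthly_tokens(MONTHLY_TOKENS_2030, GROWTH_VS_2026)
agent_low, agent_high = agent_token_range(1_000_000)  # hypothetical chatbot workload
```

Taken at face value, the forecast implies a 2026 level of roughly 5 quadrillion Tokens per month, and a task that costs a chatbot one million Tokens would cost an Agent somewhere between 5 million and 30 million.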

Traditional SaaS is billed by seat, and the IT department can lock in the annual expenditure ceiling when making purchases.

The cost structure of AI tools is fundamentally incompatible with this model. The Token bill grows dynamically with usage behavior. The finance department lacks historical data to establish a baseline, the IT department has no mature tools for real-time tracking and cost allocation, and business departments have not set up a cost attribution mechanism while promoting usage.
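The contrast between seat billing and Token billing can be made concrete with a minimal sketch. It assumes a flat per-seat annual price and a flat price per million Tokens; both prices, the department labels, and the function names are hypothetical and do not reflect any vendor's actual pricing.

```python
from collections import defaultdict


def seat_based_annual_cost(seats: int, price_per_seat: float) -> float:
    """Seat billing: the yearly ceiling is known at purchase time."""
    return seats * price_per_seat


def usage_cost_by_department(usage_records, price_per_million_tokens: float) -> dict:
    """Token billing: cost follows behavior, so it is attributed per department.

    usage_records is an iterable of (department, tokens_used) pairs.
    """
    totals = defaultdict(float)
    for department, tokens in usage_records:
        totals[department] += tokens / 1_000_000 * price_per_million_tokens
    return dict(totals)


# Hypothetical numbers for illustration only.
fixed_ceiling = seat_based_annual_cost(seats=100, price_per_seat=240.0)
monthly_bill = usage_cost_by_department(
    [("rd", 2_000_000), ("ops", 500_000), ("rd", 1_000_000)],
    price_per_million_tokens=10.0,
)
```

The first function returns a number that can be fixed in a budget before the year starts. The second can only be computed after the usage has happened, which is why attribution has to be built into the logging from the outset.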

It's not that AI is useless; it's that the enterprise's FinOps system hasn't kept up with the consumption speed of AI. So Microsoft and Uber have urgently hit the brakes.

03.

Employees Are Taking Advantage, and the Business Is Making Promises

The current situation of domestic companies is slightly different from that in Silicon Valley. The anxiety in Silicon Valley is that the usage is growing too fast and the bills are over budget; while the more realistic embarrassment for domestic large companies is that money has been spent, but employees are not using AI deeply enough, and the actual business value is unclear.

On the consumer side, domestic AI applications are more popular than ever. According to QbitAI's 2026 industry report, in April this year monthly web visits to domestic AI applications exceeded 900 million, monthly app downloads exceeded 240 million, and daily active users reached 670 million, up 223% year on year. QuestMobile data also confirms that, as of March 2026, domestic AI-native apps had 440 million monthly active users, with Doubao, Qianwen, and DeepSeek as the top three in the industry.

The booming consumer-side numbers have not translated in step into productivity growth at the enterprise level.

According to Accenture's "2025 China Enterprise Digital Transformation Index", 46% of domestic enterprises have started adapting and deploying AI, but only 9% have achieved significant breakthroughs in business value. At most enterprises, AI deployment is still at an early stage of shallow trials, blind promotion, and scenario exploration.

In shallow scenarios such as content creation, customer service Q&A, and code assistance, AI has a low entry threshold and delivers quick results. In key business areas such as core R&D, supply chain management, financial risk control, and organizational collaboration, however, the adaptation difficulty, compliance threshold, and deployment cost all rise significantly.

Large companies hand out Token subsidies to all employees in order to reduce the cost of AI trial and error through financial incentives and push everyone into AI workflows. The approach has some logic to it: only with sufficient usage density can an enterprise be pushed to identify the scenarios that genuinely fit its business while also building employees' AI habits.

The problem, however, is that if usage is encouraged without a value measurement system, the benefit turns into bill pressure, and pseudo-AI work styles such as "competing on PPTs" and "competing on documents" will emerge.

AI is penetrating the workplace far faster than expected. According to a 2026 Cognizant report, 93% of jobs in the United States will be affected by AI to some degree, six years earlier than previously forecast.

AI exposure is surging across major job categories. The data shows that in 2023 the AI exposure of management, financial operations, and administrative support positions was only 14% to 21%, but it has now soared to 60% to 68%; the AI exposure of lawyers has jumped from 9% to 63%, and even the theoretical AI exposure of CEOs has exceeded 60%. The report also stresses that theoretical exposure does not mean actual replacement. Accountability, industry regulation, and human judgment remain the main barriers to full-scale AI deployment.

This means that AI will continue to enter more positions, and Token consumption will also spread from the R&D department to the broader organization. What enterprises really need to face is how to judge whether a Token expenditure is worthwhile.

04.

Squeezing Out the Token Bubble: Shifting from "Usage Worship" to "Efficiency Measurement"

Tokens themselves are not the problem. For enterprises to build a mature AI productivity system, sufficient Token investment is necessary as support. The core crux of the industry chaos has never been "using too many Tokens", but "treating Token usage as the only goal".

Meta's Token leaderboard appears to have sparked enthusiasm for AI across the workforce and, to some extent, prompted employees to try new AI tools, but it cannot escape its core flaw: total Token consumption has no direct relationship with employees' business output.

The cost crises of Microsoft and Uber also confirm that simply cutting Token quotas across the board only treats the symptoms, not the root cause, and may even harm truly efficient AI office scenarios.

Robin Li tried to provide an answer. He proposed the concept of DAA: Daily Active Agent. He advocates using the number of daily active Agents to measure the actual penetration degree of AI, rather than the total Token consumption. This direction makes sense, but the specific calculation method also needs to be improved. After all, an active Agent may not necessarily run through the business process.
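The article does not say how DAA would be calculated, so the following is only one possible reading: count the distinct Agents that logged any activity on a given day. The event format and the `require_completion` flag are assumptions added here to illustrate the caveat above, namely that an Agent can be active without finishing any business task.

```python
from collections import defaultdict
from datetime import date


def daily_active_agents(events, require_completion: bool = False) -> dict:
    """Count distinct Agents per day.

    events is an iterable of (day, agent_id, completed) tuples, where
    `completed` says whether that run finished its task (hypothetical field).
    With require_completion=True, only Agents with at least one completed
    run on that day are counted.
    """
    active = defaultdict(set)
    for day, agent_id, completed in events:
        if require_completion and not completed:
            continue
        active[day].add(agent_id)
    return {day: len(agents) for day, agents in active.items()}


# Hypothetical event log for illustration only.
events = [
    (date(2026, 5, 1), "agent-1", True),
    (date(2026, 5, 1), "agent-1", False),
    (date(2026, 5, 1), "agent-2", False),
    (date(2026, 5, 2), "agent-2", True),
]
loose_daa = daily_active_agents(events)
strict_daa = daily_active_agents(events, require_completion=True)
```

On May 1 the loose count sees two active Agents while the strict count sees only one, which is the gap between "active" and "actually ran through the process" that any real DAA definition would have to settle.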

The core transformation direction for enterprises is to abandon Token worship and establish an AI efficiency mindset.

When evaluating AI work in R&D, the focus should be on the merge pass rate, defect rate, rework rate, and project delivery cycle of AI-generated code, rather than on call frequency;

In customer service, the key metrics should be the first-contact resolution rate, manual takeover rate, and user satisfaction;

In the marketing content scenario, the emphasis should be on output efficiency, conversion effect, and compliance risk control;

For AI Agent workflows, it is necessary to focus on investigating wasteful behaviors such as ineffective retries, redundant contexts, and unreasonable model calls.

The core logic of this refined cost and value management system is to accurately distinguish between effective AI calls and ineffective resource consumption and cut off wasteful behaviors of simply piling up volume.
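As a rough illustration of separating effective calls from waste in an R&D Agent workflow, the sketch below computes a merge pass rate, the share of Tokens spent on retries that never produced a merged change, and Tokens per merged change. The `AgentRun` record, its fields, and the definition of an "ineffective retry" are all assumptions made for this example, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    tokens: int      # Tokens consumed by this run
    is_retry: bool   # run repeats an earlier failed attempt
    merged: bool     # resulting change passed review and was merged


def summarize(runs) -> dict:
    """Summarize value-oriented metrics for a list of AgentRun records."""
    total_tokens = sum(r.tokens for r in runs)
    # Assumed definition: a retry is "ineffective" if it did not end in a merge.
    wasted_retry_tokens = sum(r.tokens for r in runs if r.is_retry and not r.merged)
    merged_count = sum(1 for r in runs if r.merged)
    return {
        "merge_pass_rate": merged_count / len(runs) if runs else 0.0,
        "wasted_retry_share": wasted_retry_tokens / total_tokens if total_tokens else 0.0,
        "tokens_per_merged_change": total_tokens / merged_count if merged_count else None,
    }


# Hypothetical runs for illustration only.
report = summarize([
    AgentRun(tokens=1000, is_retry=False, merged=True),
    AgentRun(tokens=3000, is_retry=True, merged=False),
    AgentRun(tokens=1000, is_retry=True, merged=True),
    AgentRun(tokens=5000, is_retry=False, merged=False),
])
```

Two teams with the same total Token bill can look very different on these three numbers, which is the distinction a raw consumption leaderboard cannot show.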

As AI is deeply implemented, Tokens will become a core production expense on par with electricity bills, cloud services, and labor costs. Silicon Valley enterprises are making up for their mistakes in blind expansion, while domestic enterprises are popularizing AI usage through subsidies.

Moving from Tokens to DAA is a step from "how much is burned" to "how much is running". But no one has yet given a real answer to the question of "how much it is worth".

This article is from the WeChat official account "Emphasize Next" (ID: leo89203898), author: Xin Jian, editor: Xiao Bai. It is published by 36Kr with authorization.