From unlimited usage to employees paying for tokens themselves, internet companies can no longer afford their token bills.
It's been less than two months since all employees were encouraged to maximize their Token usage, but Internet companies have quickly changed course.
On June 5th, Tencent announced an internal adjustment to the AI Token quota. The core change is that the unified quota for all employees has been replaced with a dynamic allocation based on work tasks. The notice clearly states that the total investment will only increase, and for those who can significantly improve efficiency and generate value with AI, their Token quota will be guaranteed. There will be no ranking based on Token consumption, and no anxiety will be created.
The rapid consumption of Token quotas by large companies has even exceeded their own expectations.
In April this year, Praveen Naga, Uber's Chief Technology Officer, said that the company had exhausted its 2026 AI budget within four months. Uber's R&D expenditure in 2025 reached $3.4 billion. Meta employees, meanwhile, consumed 60.2 trillion AI tokens in 30 days, at a cost exceeding $100 million.
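As a rough sanity check on the Meta figures above, the implied blended price per million tokens can be worked out directly. This is a minimal sketch: it treats the cost as exactly $100 million, whereas the article says only that it exceeded that amount, so the result is a lower bound. The function and constant names are illustrative.

```python
def implied_cost_per_million(total_cost_usd: float, total_tokens: float) -> float:
    """Return the blended cost in USD per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures reported in the article; the cost is a floor ("exceeding $100 million").
META_TOKENS = 60.2e12
META_COST_USD = 100e6

meta_rate = implied_cost_per_million(META_COST_USD, META_TOKENS)
print(f"Implied blended rate: at least ${meta_rate:.2f} per million tokens")
```

Under that assumption the blended rate comes to a little under $2 per million tokens.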
The situation is the same in China. On May 20th, Zheng Yinhe, head of the AI NPC & Gameplay technology team for the "Honkai" series, revealed that an employee had built dozens of Agents to collaborate on a project and ended up burning 2 million yuan worth of tokens overnight.
Previously, to implement their AI strategies, many companies hoped employees would make the most of AI tools. Some even ranked employees by Token usage and made it one of the criteria for promotions and raises. After seeing the sky-high Token bills, however, Internet companies were stunned.
90% of the quota gone in 3 days: large companies drastically cut Token usage
Tencent's dynamic adjustment of the Token quota was not announced in advance, which caught some employees off guard. A Tencent R&D staff member said that he did not have enough tokens: between the notice being issued and the day he spoke, his quota had dropped to 10%, and using Claude would quickly deplete the rest.
Tech Planet learned that the adjustment covers everyone, including interns, outsourced workers, and full-time employees. Currently, only the Hunyuan large model is free for everyone. Some people think the adjustment is reasonable. "Obviously, it's impossible to keep supplying tokens in excessive amounts," a Tencent employee commented.
An outsourced employee working in big data at Tencent told Tech Planet that large-model usage was previously metered with a points system. With 100,000 points, they paid no attention to the actual Token count, but it was enough for a month's use. Now, outsourced employees can only apply to use the Hunyuan model, which has no Token limit.
However, the Hunyuan model does not stand out among foundation models. Thanks to its "strong reasoning + 256K ultra-long context" ability, Hy3 preview once held the top spot on the OpenRouter global weekly list for a sustained run. But in overall capability, especially on complex tasks such as programming, Hy3 still lags models like DeepSeek V4 Flash and Claude Sonnet 4.6.
The impact of the Token adjustment also varies by department and by person. Some people have only $100 left, while others have more than 10,000 yuan.
A Tencent intern told Tech Planet that he had only $100 before the adjustment and has $200 after it. He can use all the advanced models on the market, but $200 is nowhere near enough: when writing code, he can spend as much as $50 a day. A Tencent employee working on AI pre-research for games said that he currently has 12,600 yuan, while a colleague has 21,000 yuan. Others said that their Token quota was simply cut in half.
A Tencent back-end R&D staff member said that although the Token quota has been reduced, his group is not affected. If they run out of tokens, they can apply to their superiors for more.
Previously, it was reported that Tencent issued each employee a Token package worth about 220,000 yuan. Based on the 114,848 employees cited in Tencent Group's Q1 2026 financial report, Tencent would need to pay roughly 25.3 billion yuan a year. By comparison, its R&D expenditure in 2025 was 85.75 billion yuan.
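The arithmetic behind that estimate can be reproduced in a few lines. This is a minimal sketch using only the figures quoted above. It assumes, as the article's calculation implies, that the 220,000-yuan package is an annual amount; the constant names are illustrative.

```python
EMPLOYEES = 114_848        # headcount cited from Tencent's Q1 2026 report
PACKAGE_YUAN = 220_000     # reported per-employee token package (assumed to be annual)
RND_2025_YUAN = 85.75e9    # reported 2025 R&D expenditure

annual_cost = EMPLOYEES * PACKAGE_YUAN
share_of_rnd = annual_cost / RND_2025_YUAN

print(f"Annual token cost: {annual_cost / 1e9:.2f} billion yuan")
print(f"Share of 2025 R&D spending: {share_of_rnd:.1%}")
```

The product comes to about 25.27 billion yuan, which agrees with the article's figure to within rounding and would equal a little under 30% of the 2025 R&D budget.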
But now even cash-rich Tencent has to start watching its spending, and it is just a microcosm of the industry. Tech Planet learned that major domestic companies require employees to use in-house large models first. In-house models are basically free for employees, and some companies even block competing models. However, the output of in-house models may still be inferior to that of overseas models.
A ByteDance employee told Tech Planet that the company does not force employees to use AI. "Token quotas are a heavy burden for large companies. At many Internet companies, quotas differ by position and department. At ByteDance, if employees in AI-related R&D positions run short of quota, they can apply internally to purchase tokens separately," he added.
An employee from Meituan said that he hasn't heard of any quota restrictions within the company, and his quota is more than enough. An employee from Baidu said that the quota restrictions vary according to different departments.
Breaking the blind worship of Tokens
Large companies may still be hesitating over whether to cut Token quotas, but a growing number of small and medium-sized Internet companies can no longer hold out.
A cross-border payment company in Guangzhou decided to cut its employees' Token usage, going from no upper limit to a monthly per-capita quota of $500. In the previous month, its employees had consumed $400,000 worth of Tokens.
"This is far from enough," a programmer at the company told Tech Planet. Employees at his company have even begun borrowing Tokens from one another. For example, a back-end developer consumed $370 worth of Tokens in two days and, once his quota ran out, started borrowing from others.
Previously, Internet companies of all sizes practiced Token-maxxing for fear of missing the AI wave, so employees went out of their way to find ways to consume Tokens. One employee said that back-end programmers in particular developed various packages and skills, and that each business line had many efficiency-enhancing tools. Some programmers ran several agents at once and could burn hundreds of millions of Tokens in an hour. Before the new rule was issued, some programmers had exceeded their budget by more than $1,000. With Claude Sonnet 4.6, currently the mainstream choice for programming, consuming 100 million Tokens would cost at least 2,000 yuan and as much as 10,000 yuan.
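The 2,000 to 10,000 yuan range for 100 million Tokens can be roughly reconstructed as a lower bound (every token billed at the input rate) and an upper bound (every token billed at the output rate). The sketch below is illustrative only. The per-million prices of $3 for input and $15 for output, and the exchange rate of 7 yuan per US dollar, are assumptions that do not come from the article; check them against current published pricing before relying on the numbers.

```python
def cost_range_yuan(
    tokens: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
    yuan_per_usd: float,
) -> tuple[float, float]:
    """Return (low, high) cost in yuan.

    low assumes every token is billed at the input rate,
    high assumes every token is billed at the output rate.
    """
    millions = tokens / 1_000_000
    low = millions * input_usd_per_million * yuan_per_usd
    high = millions * output_usd_per_million * yuan_per_usd
    return low, high


# Assumed values, not taken from the article.
ASSUMED_INPUT_USD_PER_M = 3.0
ASSUMED_OUTPUT_USD_PER_M = 15.0
ASSUMED_YUAN_PER_USD = 7.0

low, high = cost_range_yuan(
    100_000_000,
    ASSUMED_INPUT_USD_PER_M,
    ASSUMED_OUTPUT_USD_PER_M,
    ASSUMED_YUAN_PER_USD,
)
print(f"100 million tokens: {low:,.0f} to {high:,.0f} yuan")
```

Under these assumptions the bounds land at about 2,100 and 10,500 yuan, in line with the range quoted above. Real bills fall somewhere in between, depending on the mix of input and output tokens.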
Token waste does exist. An employee at a new-energy vehicle company said that the company gives him a monthly Token quota of $1,000 that he cannot use up. To consume the quota, he resorts to having AI write fiction, such as a continuation of "Dream of the Red Chamber".
An employee from an established Internet company in Shanghai told Tech Planet that the company used to have no restrictions, but now it has started to manage everyone's Token quotas uniformly. Everyone needs to apply for Tokens through DingTalk approval, and each person's quota ranges from a few hundred yuan to 1,000 yuan.
This situation is becoming more and more common. An employee at a mid-sized Internet company in Beijing said that previously everyone could use Claude Code with no quota limit, and the company would reimburse the cost. Now the company has opened up Anthropic API access with a monthly quota of 1,000 yuan per person, and employees are required to use cheaper domestic large models first.
But the reality is that cheap large models can only handle simple code-completion tasks. On complex tasks that require multiple rounds of interaction, they are even less efficient than writing the code by hand. "I've started to buy my own quota. The 1,000-yuan quota may not even last a week."
Some companies require full-stack AI, which has led to a significant increase in Token usage. An employee at a game company in Guangzhou said that Tokens used to be free for all employees. He used nearly 30,000 Tokens in a month, and everyone in the department exceeded the quota. After that, they could only use the DeepSeek model.
A programmer in Shanghai shared on a social platform that there are only four people in his department, but they consumed Tokens worth 60,000 yuan in a month. Now, the technical leader has directly purchased DeepSeek Tokens and asked the technicians to switch.
Another consequence of Token-maxxing is that, during code review, many programmers found that they could not understand the code they had produced, or even locate it or explain why it was written that way. Company managers found that, even with AI, overall operational efficiency did not improve. When large-model requests had to queue, it even delayed important products.
Robin Li, the founder of Baidu, first proposed the concept of Daily Active Agents (DAA) at this year's AI Developer Conference. DAA roughly corresponds to the Daily Active Users (DAU) in the mobile Internet era. It seems that DAA is a better measure of the real prosperity of the platform and ecosystem than simply looking at Token consumption.
From the unrestrained waste with no upper limit to the current "quota system" and "domestic substitution", Internet companies' blind worship of AI is going through an inevitable process of disenchantment.
This article is from the WeChat official account "Tech Planet" (ID: tech618), author: Wang Lin, published by 36Kr with authorization.