
Big tech companies are no longer providing unlimited Tokens: Tencent has imposed usage limits, and ByteDance allows partial reimbursement for related expenses

Friends of 36Kr · 2026-06-15 18:40
When tokens are linked to cost and production capacity, how should enterprises define and allocate their quotas? Major tech companies are still searching for the answer.

In June, a Tencent employee noticed that the Token quota assigned to them on the internal management dashboard had been cut. "Previously, I had a monthly quota of $2,000 (about 13,500 RMB), but this month it's only 1,400 RMB, and it was used up in just two days."

According to an incomplete tally by Economic Observer, the average monthly Token quota per employee currently varies widely across Tencent departments, ranging from 1,000 RMB to 7,000 RMB. Once a quota is allocated to a group, the group manager distributes it to individual employees. When the quota runs short, employees can "raise their hands" and ask the manager for an increase.

Major domestic tech companies currently allocate Tokens to employees in two main ways. One is to allocate the quota to individual employees; if they need more, they pay out of pocket and are partially reimbursed. The other is to allocate the quota to the department as part of its budget, with the manager distributing it within the department.

No matter who receives the Tokens or how they are allocated, it ultimately comes down to the money spent to buy them. The use of Agents at work has driven an exponential increase in Token consumption, and computing costs are squeezing the profits of major tech companies. By mid-2026, leading tech giants in China and abroad, including Microsoft and Meta, have begun putting the brakes on unlimited internal AI use by monitoring, restricting, and dynamically adjusting employees' AI Token usage.

After the quota was reduced, some employees voiced concerns about returning to "traditional programming." They face the pain of moving from abundance to frugality: should they go back to manual coding, or pay out of pocket to top up the quota and work at their own expense?

The "Equal-Sharing System" Is Over

Tencent is one of the first domestic Internet giants to control employees' Token quotas. According to Economic Observer, since June the Token quotas of employees in multiple Tencent businesses have decreased, and there are significant differences between departments. In the Hunyuan large-model team, where AI demand is high, the monthly Token quota per employee is about 7,000 RMB. In the YouTu Laboratory, which focuses on computer vision, the quota is about 5,250 RMB. An outsourced employee at Tencent Entertainment said that their monthly Token quota is only 1,000 RMB.

"The group shares a quota pool, and the team leader distributes it." The aforementioned Tencent employee said, "This is a temporary measure for this month, and it may be changed next month."

In March this year, a post on the Maimai community said that Tencent had allocated "220,000 RMB worth of Token resources per person per year" to employees, including a monthly Cursor quota of $700, a Claude quota of $700, and a CodeBuddy quota of $1,000, among others, to encourage employees to use AI to improve efficiency. Many Tencent employees confirmed the post.

As the AI frenzy grew, so did speculation about whether Token usage was being treated as a measure of work input. At the end of March, a Tencent employee posted in the Maimai colleagues' circle that some businesses were tallying the Token usage of each department and team and ranking them. Some employees, worried that their Token consumption was too low, set up meaningless workflows during work hours, had Agents repeat tasks, handled personal needs, and even "took on private jobs" to make sure their Token usage did not fall behind.

With this adjustment, Tencent aims to move away from the previous "equal-sharing system" model, which used Token consumption as the sole measurement standard. Economic Observer learned that in 2026 Tencent will continue to increase its Token investment, but Tokens will no longer be allocated to employees according to a unified standard. Department managers will allocate resources dynamically based on actual work, and employees who need more can apply for an increase. Internally, the company opposes ranking by Token usage and does not measure employee output simply by Token consumption.

What If the Token Quota Is Exceeded?

Beyond Tencent, other Internet giants follow different logic in allocating Token quotas.

In positions that rely heavily on AI, such as R&D, Alibaba employees have a monthly quota of about 8,000 RMB, with no restrictions on models. Employees say it is "basically enough" to handle daily needs. JD employees can call the company's own models without limit, and the fees generated by calling external models are shared by the department. A Meituan employee said they had not heard of a clear Token quota standard, but that when using internal AI products they often find "the model becoming stupid" and suspect they are being "downgraded to a lower-quality model" because of excessive calls.

At ByteDance, employees can call the models in TRAE (ByteDance's self-developed AI IDE product) without limit, including GPT, Gemini, Grok, etc. If work requires calling other models, the resulting fees can be partially reimbursed. The reimbursement standard for some departments is 50% of the actual expenditure. The annual reimbursement limit is $1,000 for R&D positions and $300 for other positions.
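The reimbursement figures above can be turned into a small worked calculation. The sketch below is illustrative only: the 50% rate and the $1,000 and $300 annual limits come from the report, while the function name, the role labels, and the assumption that the annual limit applies to the reimbursed amount (rather than to the employee's total spend) are hypothetical and not confirmed by the source.

```python
# Figures reported in the article. How the rate and the annual limit
# interact is an ASSUMPTION made for illustration.
REIMBURSEMENT_RATE = 0.50
ANNUAL_LIMIT_USD = {"rd": 1000.0, "other": 300.0}  # hypothetical role labels


def estimate_reimbursement(spend_usd, role, already_reimbursed_usd=0.0):
    """Estimate the reimbursement for one expense claim.

    Assumes the annual limit caps the reimbursed amount, and that
    earlier reimbursements in the same year count against it.
    """
    if spend_usd < 0 or already_reimbursed_usd < 0:
        raise ValueError("amounts must be non-negative")
    remaining = max(ANNUAL_LIMIT_USD[role] - already_reimbursed_usd, 0.0)
    return min(spend_usd * REIMBURSEMENT_RATE, remaining)


if __name__ == "__main__":
    # A hypothetical R&D employee claims $200 of external model fees.
    print(estimate_reimbursement(200.0, "rd"))
    # The same claim after $950 has already been reimbursed this year.
    print(estimate_reimbursement(200.0, "rd", already_reimbursed_usd=950.0))
```

Under these assumptions, an R&D employee would reach the annual limit after $2,000 of out-of-pocket spending, and an employee in another position after $600.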

In how they allocate employees' Token quotas, major tech companies are sending the same signal: AI should be used, but Tokens need to be managed; otherwise, costs may get out of control.

On May 20th, at the 2026 Alibaba Cloud Summit, Zheng Yinhe, head of the AI NPC & Gameplay technology team for Mihoyo's "Honkai" series, shared the team's experience in exploring AI: some employees set up dozens of Agents to collaborate and burned about 2 million RMB worth of Tokens in one night.

An R&D staff member at an AI startup told Economic Observer that his team of about 50 people ran up a Token cost of about $200,000 in the past month, an average of $4,000 per person. "It's mainly used for coding, and the models that are good at coding are all expensive. The boss asked us to save some, and we're considering switching to cheaper models later."
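The per-person figure follows directly from the two numbers the staff member gave. A minimal check, using a hypothetical helper name and treating the approximate headcount and cost as exact:

```python
def cost_per_person(total_cost_usd, headcount):
    """Average Token cost per team member for a given period."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return total_cost_usd / headcount


if __name__ == "__main__":
    # About $200,000 in one month across a team of about 50 people.
    print(cost_per_person(200_000, 50))
```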

At the beginning of June, OpenAI CEO Sam Altman said in a live broadcast that AI expenditure has become a major problem for enterprises, and "at the beginning of the year, people were still very satisfied with their spending."

Switch to Lower-Cost Models or Pay Out of Pocket to Top Up the Quota

"The dashboard says that if your Tokens aren't enough, you can ask the person in charge to raise your quota," a Tencent employee said. His monthly Token quota had been about $3,000, but after the dashboard was updated in June, it was only 5,000 RMB. "It was used up within three days of being issued. Once the Agent and Subagent start running, the quota goes quickly." He applied to his supervisor for an increase, but was told that the department's budget was limited, and the request was rejected. "The upper limits of each department are different, depending on the business situation."

Tencent's current Token quota adjustment mainly targets external models. Employees can still use Tencent's self-developed Hunyuan large model without limit, which has led to an increase in Hunyuan usage. Some employees said that after switching back to Hunyuan, their work efficiency dropped and the experience got worse. "There are serious hallucinations, and it's not a model dedicated to coding. It's not as good as manual coding."

Struggling with the company's insufficient quota, one Tencent employee chose to pay out of pocket for a Codex Pro 20x subscription at $200 per month. He reasoned: "With Hunyuan, it takes a long time to meet basic needs, and if it fails, you have to run it again, which wastes time. Although I don't want to work at my own expense, it's better than having no quota. The price is okay."

Even at ByteDance, where quotas are relatively generous, model calls are not unrestricted. "Most of my colleagues and I use GPT-5.5. The model interfaces in the company always have long queues," a ByteDance R&D staff member said. Even though this model is covered by the company, some employees still subscribe with their personal accounts so that queuing does not delay their work.

Once quotas are restricted, it is difficult for employees to return to the purely manual mode that preceded AI. "The workload has gone up, and it hasn't come down just because the Token quota was cut. And after using AI, it's hard for me to go back to traditional programming," said a Tencent employee troubled by the insufficient quota.

When Tokens are tied to cost and productivity, how should enterprises define and allocate quotas? Major tech companies are still looking for the answer.

This article is from the WeChat official account "Economic Observer", author: Liu Sixuan, published by 36Kr with authorization.