
After adopting AI, the company seems to be even poorer.

Tech Fox, 2026-06-07 10:11
The company can no longer afford to burn tokens.

When AI first emerged, bosses thought it was an opportunity to lay off employees and reduce costs.

The vision was that one AI could replace three people, work tirelessly, be available on call, not require salary increases or social security, and be online 24/7.

It sounds perfect, but the reality is that while AI doesn't slack off or work overtime, it charges more for each additional task it performs.

As a result, many companies are now exclaiming that they can't afford the tokens.

Many people's first reaction is: Is it really that bad? Isn't AI getting cheaper? After DeepSeek came out, didn't they say the cost of large models had been reduced?

But many people overlook one thing: While the price of models has dropped, companies are using them more intensively.

From occasional use by one person to widespread use by all employees, and then to dozens of Agents running in the background 24/7, the result is that while the cost per call has decreased, the monthly bill has become more and more expensive.
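The arithmetic behind this can be sketched in a few lines. The prices and call volumes below are hypothetical, chosen only to illustrate the effect; they are not figures from any of the companies mentioned.

```python
def monthly_bill(price_per_million_tokens, tokens_per_call, calls_per_day, days=30):
    """Return the monthly bill for a given unit price and usage volume."""
    total_tokens = tokens_per_call * calls_per_day * days
    return price_per_million_tokens * total_tokens / 1_000_000


# Hypothetical scenario: the unit price falls 10x, but usage grows 100x
# as the tool spreads from a few people to all employees and background Agents.
before = monthly_bill(price_per_million_tokens=10.0, tokens_per_call=2_000, calls_per_day=500)
after = monthly_bill(price_per_million_tokens=1.0, tokens_per_call=2_000, calls_per_day=50_000)

print(f"before: {before:.2f}, after: {after:.2f}")
```

Under these assumed numbers, each call is ten times cheaper, yet the monthly bill is ten times larger.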

For example, Uber gave 5,000 engineers access to Claude Code, and in just a few months, it almost exhausted its entire annual AI budget.

Microsoft has also started to put the brakes on recently, tightening the internal usage permissions of Claude Code and no longer allowing engineers to call it without limits.

To put it bluntly, the stage of "using it however you want" is over.

Amazon was even more direct and simply removed the internal AI usage leaderboard.

The reason is simple. They found that once "how much AI is used" became a metric, employees started burning tokens wildly just to climb the ranking.

It seems that everyone is actively embracing AI, but in fact, many calls don't produce any results. It's just "using it for the sake of using it."

In a multi-Agent experiment at miHoYo, dozens of Agents called each other, waited for each other, and confirmed with each other in the background. One would ask a question, another would reply, and then each would ask the other to confirm. No Agent ever ended the process, and the call chain kept getting longer.

Finally, they burned about 2 million yuan worth of tokens in one night, but the actual value produced was negligible.

Seeing this, many people may have a question: What exactly are tokens? Why can they burn a company to such an extent?

Actually, tokens can be understood as the electricity bill in the AI world.

When you ask a question in the chat box and the AI replies to you in a few seconds, it seems free.

But in the corporate background, every time you input a sentence, output a piece of content, call a model, or let an Agent execute a tool, and even when AIs discuss with each other, token consumption will occur.

More importantly, the charging logic of AI is completely different from that of traditional software.

In the past, when buying software, the cost was basically fixed. You could calculate how much an account would cost and what the annual budget would be at the beginning of the year.

AI is different. It is charged according to the usage, and this usage will continue to increase with the complexity of the business.

If an employee asks a few questions occasionally, the cost may not be much. But when the whole team uses it together, the cost starts to rise.

When you connect Agents and let AI call AI, the bill can easily go from a few thousand yuan to hundreds of thousands or even millions.
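A minimal sketch of usage-based billing makes the difference visible. It assumes input and output tokens are priced separately per million tokens; the prices, source names, and token counts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical prices, in yuan per million tokens.
INPUT_PRICE = 2.0
OUTPUT_PRICE = 8.0


def bill_by_source(events):
    """Sum the cost of (source, input_tokens, output_tokens) events per source."""
    totals = defaultdict(float)
    for source, input_tokens, output_tokens in events:
        cost = (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000
        totals[source] += cost
    return dict(totals)


# Hypothetical usage log: one chat question, one tool call, two Agent-to-Agent exchanges.
events = [
    ("chat", 1_000, 500),
    ("agent_tool_call", 20_000, 2_000),
    ("agent_to_agent", 50_000, 5_000),
    ("agent_to_agent", 50_000, 5_000),
]

costs = bill_by_source(events)
for source, cost in costs.items():
    print(f"{source}: {cost:.3f} yuan")
```

In this made-up log, the single chat question is the cheapest item by a wide margin; the Agent-to-Agent traffic, which no human ever sees, accounts for most of the bill.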

Unfortunately, in the past two years, the whole society has been encouraging people to use AI more.

To increase AI penetration, usage frequency, and automation, some companies have even written token consumption into the performance evaluation.

There is a classic law in economics called Goodhart's Law: When a metric becomes a target, it is no longer a good metric.

Overseas, people have even coined a word for this, Tokenmaxxing, which can be roughly understood as "maximizing token usage."

Some people let AI optimize the same piece of code dozens of times, and some let AI generate a dozen versions of a report at once.

Some people break a task that can be completed in a few steps into a bunch of collaborating Agents, just to make the system look more intelligent. AI is increasingly becoming something showy but useless.

Normally, small-scale usage can be tolerated, but what really pushes the cost to the verge of being out of control is the multi-Agent system.

Theoretically, this system is great: one Agent is responsible for planning, one for execution, one for inspection, and one for summarization, like a digital team collaboration.

But in reality, it's more like a project meeting without a host.

You ask me, I ask you; you wait for me, I wait for you; one round of confirmation is not enough, so we do another round. Everyone is moving, but the task can't be completed.

In most multi-Agent systems, 30% to 60% of the tokens are actually consumed in such meaningless cycles.

To put it bluntly, a lot of money doesn't turn into results. Instead, it burns away during the "meetings" between AIs.

Ironically, these Agents are not slacking off. On the contrary, they are too serious.

They follow the process strictly and execute every step of the logic. One Agent calls another, the second asks the first to confirm, and so on, until the whole system is stuck in a loop that never terminates.

This is a bit like when dozens of people in a company have a meeting from evening to morning in a conference room. Everyone is speaking and very engaged, but no one makes a decision, and the meeting is charged by the second.

The key problem is that these "meetings" keep being replicated, split, and nested. Once the scale expands, the cost starts to spiral out of control.

The cost of AI is never a one-time thing. It keeps growing with the call chain and is almost impossible to predict.
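One common way to keep such a "meeting" from running all night is to give it a hard stop: a maximum number of turns and a token budget. The sketch below is an illustration under stated assumptions, not a description of any company's system. The `planner` and `checker` functions are hypothetical stand-ins for model calls, and word count is a crude substitute for a real tokenizer.

```python
def run_agents(agents, first_message, max_turns=6, token_budget=1_000):
    """Pass a message between agents in turn until one says DONE or a limit is hit."""
    message = first_message
    tokens_used = 0
    for turn in range(1, max_turns + 1):
        agent = agents[(turn - 1) % len(agents)]
        message = agent(message)
        tokens_used += len(message.split())  # crude stand-in for a tokenizer
        if message == "DONE":
            return {"status": "done", "turns": turn, "tokens": tokens_used}
        if tokens_used > token_budget:
            return {"status": "budget_exceeded", "turns": turn, "tokens": tokens_used}
    # Nobody ended the conversation, so the guard ends it instead.
    return {"status": "max_turns", "turns": max_turns, "tokens": tokens_used}


# Hypothetical agents that only ever ask each other for confirmation.
def planner(message):
    return "Please confirm the plan"


def checker(message):
    return "Confirmed, please confirm receipt"


result = run_agents([planner, checker], "Start the task")
tight = run_agents([planner, checker], "Start the task", token_budget=10)
print(result)
print(tight)
```

Neither agent ever says "DONE", so without the two limits the loop would run, and bill, indefinitely. With them, the worst case is bounded and known in advance.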

Now, people are no longer discussing "whether AI is useful." Instead, they are calculating a more realistic thing: Will this thing really blow up the bill?

The domestic models DeepSeek and Doubao are suddenly being talked about again, not out of sentiment but for a very practical reason: the same task may cost only a fraction as much with them.

To put it bluntly, don't always use the most expensive model. Give simple tasks to the cheaper models and reserve the large ones for complex tasks.
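This kind of routing can be sketched as a simple rule. Everything below is assumed for illustration: the model tiers, their prices, the task types, and the list of which task types count as complex.

```python
# Hypothetical prices, in yuan per million tokens.
PRICES = {"small": 1.0, "large": 15.0}

# Hypothetical rule: only these task types are sent to the large model.
COMPLEX_TASK_TYPES = {"multi_file_refactor", "architecture_review"}


def choose_model(task_type):
    """Route complex task types to the large model and everything else to the small one."""
    return "large" if task_type in COMPLEX_TASK_TYPES else "small"


def total_cost(tasks, router):
    """Sum the cost of (task_type, tokens) pairs under a given routing function."""
    return sum(PRICES[router(task_type)] * tokens / 1_000_000 for task_type, tokens in tasks)


# Hypothetical monthly workload: mostly simple tasks, a few complex ones.
tasks = [
    ("summarize_email", 100_000),
    ("format_json", 50_000),
    ("multi_file_refactor", 20_000),
]

all_large = total_cost(tasks, lambda task_type: "large")
routed = total_cost(tasks, choose_model)
print(f"all large: {all_large:.2f} yuan, routed: {routed:.2f} yuan")
```

The saving depends entirely on the mix of work. If most tokens go to genuinely complex tasks, routing changes little; if most go to routine ones, as in this made-up workload, the bill drops sharply.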

Companies are starting to understand that AI is not a tool that gets more powerful the more you use it. It's more like a system that burns more money the more you use it.

The capital market has also changed its attitude. In the past, when evaluating AI companies, investors looked at who had more calls, who was growing fastest, and who burned the most tokens.

Now, they only look at one thing: ROI. Did you get any money back after burning so many tokens?

A harsh reality is that an increase in efficiency doesn't necessarily mean making money.

If you write code twice as fast but don't sell an extra unit of the product, it's just "spending money faster," not making money.

What's even more surreal is that this is not just a problem for individual companies.

One company burned 500 million US dollars on Claude in a month, and there was even a case where token consumption soared because no usage limit had been set.

Meta was even more extreme. They had an internal leaderboard called "Claudeonomics" to see who used AI the most. The first place burned 31.2 trillion tokens in a month.

Converted into money, that one month of tokens would be enough to hire two senior engineers for a year.

It can be said that while bosses are shouting "AI for all employees," the finance department is already breaking out in a cold sweat.

The point is not to stop using AI. It's to stop burning tokens blindly.

People are starting to ask a more realistic question: Did these tokens really bring back real money?

This article is from the WeChat official account "Tech Fox" (ID: kejihutv), author: Lao Hu. It is published by 36Kr with authorization.