Microsoft hits pause on vibe coding: Burning tokens is now more expensive than employees
On May 14, 2026, Microsoft began revoking internal Claude Code licenses for most employees. The deadline is June 30, which is also the last day of Microsoft's fiscal year.
Just six months ago, Microsoft was doing the exact opposite. In December 2025, it made Claude Code available to thousands of employees, including engineers, product managers, and designers, encouraging everyone to reshape their workflows through vibe coding. Employees loved this tool, perhaps a little too much.
But six months later, Microsoft withdrew it.
Almost in the same week, Tom Blomfield, a partner at YC, said something very different during a batch talk: "If your API bill doesn't make your heart ache, it means you're not spending enough."
In the same spring, Silicon Valley is offering two completely opposite answers to the same question: Is using AI more expensive than hiring humans?
01
The Failure Scene of Vibe Coding
What Microsoft revoked is not the Claude model. Anthropic's models will still be available to Microsoft employees through Copilot CLI. What it revoked is access to Claude Code itself as a product.
The most affected department is "Experiences + Devices", that is, the engineering teams behind Windows, Microsoft 365, Outlook, Teams, and Surface. EVP Rajesh Jha framed the decision as "toolchain unification" in an internal memo, but internal Microsoft sources cited by The Verge were more straightforward: employees generally find Claude Code more user-friendly than Copilot CLI, and the popularity of Anthropic's tool inside Microsoft has left the company's own Copilot CLI "neglected".
In other words, Microsoft removed Claude Code not because it's ineffective, but because it's too effective.
The June 30 deadline is no coincidence: it is the last day of Microsoft's fiscal year. Cutting a tool that employees generally prefer, switching back to its own product, and timing the move for the fiscal year-end: it's clear to everyone how much of this is product judgment and how much is financial consideration.
Microsoft is not an isolated case.
A month ago, Praveen Neppalli Naga, the CTO of Uber, revealed to The Information that the company burned through its entire 2026 budget for AI coding tools in the first four months. Uber had previously set up an internal leaderboard that used competition to push employees to use AI more. The result was a blown budget.
Bryan Catanzaro, vice president of applied deep learning at Nvidia, was blunter in an interview with Axios: "For my team, the cost of computing power far exceeds the cost of employees." This came from an executive at a hardware company whose core business is selling computing power.
Fortune strung these clues together and gave the article a very Fortune-style title: "Microsoft's report exposes the real cost problem of AI - using this stuff is more expensive than hiring employees."
If you only read this far, the conclusion is simple: Vibe coding has failed, and the story of AI replacing humans can be put to rest.
But it's too early to draw this conclusion.
02
The Copilot Model Has "Hit a Wall"
To explain Microsoft's retreat, we first need to clarify what vibe coding is.
This term was coined by Andrej Karpathy in early 2025. He described a new way of programming: developers no longer write code line by line but describe their intent in natural language and let the LLM generate the code. They don't even read the code; they only look at the results. If it runs, they accept it; if not, they ask the AI to fix it.
This is one of the most alluring productivity promises in the AI era. It means that an engineer who can't write Rust can ask the AI to write it for him; a product manager can ask the AI to create a prototype; a designer can ask the AI to write runnable code. The people Microsoft made Claude Code available to in December 2025 - engineers, PMs, and designers - are exactly these three types of people. This is not a coincidence; it's the most classic implementation of vibe coding.
But when vibe coding is implemented in large companies, it becomes a structurally awkward thing.
Suppose there is an engineer at Microsoft with an annual salary of $300,000. After Microsoft equips him with Claude Code, his output increases by 20%. This is the ideal state of vibe coding. But how much does he burn in tokens each month? Is it $200, $500, or $2,000? That number rises monotonically as his dependence on AI deepens.
What's more troublesome is that he won't be laid off because he "uses AI" - his $300,000 annual salary remains, his benefits remain, and his workstation remains.
That is to say, Microsoft's total cost structure is "the original employee salary + the new token bill". This formula can only move in one direction: a sharp increase in costs.
Does the "20% increase in employee output" translate to a "20% increase in revenue" in financial terms? No. It means "the revenue remains the same, but there is an additional AI bill in the cost structure" - because the output of most employees does not directly correspond to new revenue. Writing faster doesn't mean the company sells more.
This is the real meaning of Catanzaro's remark that computing power costs more than employees. It doesn't mean that AI is stupid; it means that when you equip employees with AI, you can't make the numbers work.
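The arithmetic behind this cost structure can be sketched in a few lines. This is a minimal illustration, not Microsoft's accounting: the $300,000 salary and the three monthly token figures are the article's hypothetical numbers, and the function name `copilot_annual_cost` is invented for the example.

```python
# Hypothetical copilot-model cost sketch. Only the salary and the three
# monthly token figures come from the article; everything else is illustrative.
ANNUAL_SALARY = 300_000                  # the article's example engineer
MONTHLY_TOKEN_BILLS = [200, 500, 2_000]  # the three guesses in the text


def copilot_annual_cost(salary: float, monthly_tokens: float) -> float:
    """Salary is kept in full; the token bill is added on top of it."""
    return salary + 12 * monthly_tokens


for monthly in MONTHLY_TOKEN_BILLS:
    total = copilot_annual_cost(ANNUAL_SALARY, monthly)
    increase_pct = (total - ANNUAL_SALARY) / ANNUAL_SALARY * 100
    print(f"${monthly}/month in tokens -> ${total:,.0f}/year (+{increase_pct:.1f}%)")
```

Whatever the monthly figure, the total never falls below the original salary. If revenue stays flat, as the article assumes, every token dollar is a pure addition to cost.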
This logic is also supported by data.
A recent Gartner prediction says that by 2030, the inference cost of trillion-parameter large models will fall by nearly 90% compared to 2025. It sounds like AI is getting cheaper, but Gartner's real conclusion is that this won't make enterprises' total AI bill cheaper. Will Sommer, a senior director analyst at Gartner, said: "CPOs should not confuse 'the deflation of commodity-level tokens' with 'the accessibility of cutting-edge inference capabilities'."
Goldman Sachs' prediction is more direct: by 2030, agentic AI will drive token consumption up 24-fold, to 120 quadrillion tokens per month. Combine a 90% decrease in the price per token with a 24-fold increase in consumption, and the total bill still rises.
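A quick calculation shows why the bill rises. This sketch treats the bill as price times volume, which is a simplification: the 90% figure is Gartner's forecast for trillion-parameter model inference and the 24-fold figure is Goldman Sachs' forecast for overall consumption, so multiplying them follows the article's framing rather than either firm's model. The function names are invented for the example.

```python
# Illustrative price-times-volume sketch using the two forecasts quoted above.
PRICE_DROP = 0.90         # per-token price decrease by 2030 (Gartner figure in the text)
CONSUMPTION_GROWTH = 24   # token consumption multiple by 2030 (Goldman Sachs figure)


def bill_multiplier(price_drop: float, consumption_multiple: float) -> float:
    """Future bill relative to today's, assuming bill = price per token x tokens used."""
    return (1 - price_drop) * consumption_multiple


def break_even_growth(price_drop: float) -> float:
    """Consumption multiple at which the price drop is fully cancelled out."""
    return 1 / (1 - price_drop)


multiplier = bill_multiplier(PRICE_DROP, CONSUMPTION_GROWTH)
print(f"Total token bill vs. today: {multiplier:.1f}x")
print(f"Consumption growth that cancels the price drop: {break_even_growth(PRICE_DROP):.0f}x")
```

Under these assumptions the bill grows to about 2.4 times today's level, because a 90% price cut is already cancelled by a 10-fold rise in usage and the forecast is for 24-fold.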
Jensen Huang has a more radical version. He said in public a few months ago that in the future, each Nvidia employee will work with 100 AI agents.
It sounds great. But if you're a CFO, what do you hear? 100 token-burning furnaces running 24 hours a day.
The problem is not that AI is too expensive. The problem is the assumption of "equipping each employee with an AI co-pilot" itself.
This approach has a popular name in tech circles: the "copilot model". Its core assumption is that humans remain in the driver's seat, and AI provides suggestions from the co-pilot's seat. It doesn't replace you; it just makes you faster.
This assumption sounds gentle: "AI won't take your job; it just helps you." But in financial terms, its implication is that all the original salaries stay in place and a token fee is added on top.
And tokens are not a fixed cost; they are billed by consumption. The more employees use them, the more the company pays. This is exactly the cost structure that enterprises least want to see: floating, uncapped, and inversely proportional to production capacity.
When Microsoft made Claude Code available in December 2025, it may not have fully realized this. Its original thinking was: let employees try it and see how much AI can improve work efficiency. But six months later, employees were hooked and Claude Code had become too popular within Microsoft. As a result, the token bill far exceeded expectations, and it exceeded the value Microsoft could get out of that popularity.
Microsoft withdrew. But what it withdrew was not AI. It was the structure of "employees in the driver's seat and AI in the co-pilot's seat".
This is a structural failure. It won't disappear because the model becomes cheaper, nor will it disappear because employees become more proficient - it will become more serious as employees become more proficient with AI.
03
Burning Tokens Instead of Firing Employees
Almost in the same week as Microsoft's retreat, Tom Blomfield presented a completely different perspective at YC's batch talk. He didn't discuss "how to use AI"; he discussed "what a company in the AI era should look like".
Blomfield's judgment is straightforward: most companies today still have a "Roman legion"-style structure, in which information is passed up level by level, commands are passed down level by level, and humans are the core of coordination. Equipping this structure with AI is like giving firearms to Roman infantry: they will fight more aggressively, but the tactics won't change.
A truly AI-native company should look different.
Blomfield gave a very specific description: each action should produce a recordable and callable artifact, making everything clearly legible to AI; the company should be designed as a "self-improving AI cycle", where the system can sense the environment, make decisions, call tools, receive feedback, and self-correct.
People in such a company only have two roles. One is the individual contributor - everyone, regardless of department, is a builder and an operator. They bring prototypes to meetings, not just ideas. The other is the DRI (Directly Responsible Individual) - each output has a clear responsible person, and "you can't hide behind AI".
Then Blomfield delivered his most quotable line: "If your API bill doesn't make your heart ache, it means you're not spending enough."
If this sentence were said in the office of Microsoft's CFO, it would be taken as a joke. But in front of a room full of startup founders at YC, no one thought it was crazy.
Why?
Diana Hu, another partner at YC, gave the answer at Startup School in early May. She said, "Maximize not the number of employees, but the token consumption." She also had a more straightforward version: "One person equipped with AI tools is equivalent to a large engineering team in the past."
Note the keyword here: "equivalent". It's not "similar to", not "analogous to" - it's replacement.
In YC's P26 (spring 2026) batch, many companies are already using 5 or 6 people to do what used to require 20 or 30. Their token bills are high, but their payroll is extremely low. Overall, they come out ahead.
A more radical example is Block. Jack Dorsey's fintech company recently laid off 40% of its employees. This is not traditional cost-cutting for efficiency's sake: Block has also increased its internal investment in AI tools. The new structure is what Diana Hu described: IC + DRI + AI agent.
Burning tokens in the YC context is not an added expense; it's a replacement. What it replaces is not some other line item but employee salaries. The numbers work because the company has simultaneously removed the positions that would have required those salaries.
This is the fundamental reason why Microsoft and YC look at the same thing but give opposite answers: they are not burning the same kind of tokens. Microsoft's tokens fuel co-pilots for its existing staff, while YC's tokens replace the drivers themselves.
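The replacement arithmetic can be sketched the same way. This is a hypothetical illustration only: the headcounts of 30 and 6 are picked from the ranges mentioned above, the flat $300,000 salary is borrowed from the earlier Microsoft example and applied to every role as an assumption, and the function names are invented.

```python
# Hypothetical replacement-model sketch. Headcounts sit inside the ranges
# the article gives; the flat salary for every role is an assumption.
SALARY = 300_000  # assumption: the article's example salary, applied to everyone


def total_annual_cost(headcount: int, annual_tokens: float, salary: float = SALARY) -> float:
    """Payroll plus token spend for one team."""
    return headcount * salary + annual_tokens


def break_even_token_budget(old_headcount: int, new_headcount: int, salary: float = SALARY) -> float:
    """Annual token spend at which the small team costs as much as the old one."""
    return (old_headcount - new_headcount) * salary


old_team = total_annual_cost(30, 0)
budget = break_even_token_budget(30, 6)
small_team = total_annual_cost(6, budget)

print(f"30-person team, no tokens: ${old_team:,.0f}")
print(f"Break-even token budget for 6 people: ${budget:,.0f}/year")
print(f"6-person team at break-even: ${small_team:,.0f}")
```

Under these assumptions, a six-person team could spend up to $600,000 a month on tokens before it costs more than the 30-person team it replaces. In the copilot model there is no such offset, because no salary is removed.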
04
The Real Assets Are Being Redefined
During the talk, Tom Blomfield also made another thought-provoking remark: "People are transient; context documents are important."
This is a judgment at the accounting level.
How is the balance sheet of a traditional company written? On the left are fixed assets, accounts receivable, goodwill, and IP; on the right are liabilities and shareholders' equity. Employees are not in the asset column - they are costs. But every company knows in its heart that employees are actually the real assets: customer relationships are in the salespeople's minds, business intuition is in the product managers' minds, and technical know-how is in the engineers' minds.
The characteristic of this kind of "asset" is that it can leave. When an employee leaves, the asset is gone.
The AI-native company described by Blomfield is doing one thing: extracting all the assets that originally only existed in people's minds and turning them into "context assets" that are readable, callable, and iterable by AI.
What are the specific forms? They are detailed requirement documents; process documents that record every decision, every email exchange, and every Slack discussion; open MCP interfaces and APIs; and artifacts generated by each internal tool. All these things form a new, inheritable asset layer of a company that won't evaporate when employees leave.
In such a company, people become "variables": they can be plugged in quickly and can leave quickly, because the company's core assets are not in people's minds but in documents.
If this structure holds, it means not only a new organizational model but also that the company's balance sheet is being rewritten. An AI-native company with 6 people and a staggering token bill may look financially unhealthy, but its real assets may be more substantial than those of a traditional company with 60 people. Current accounting standards simply haven't learned how to measure this kind of asset.
In other words, vibe coding is not dead. It just doesn't belong to traditional companies.
The day Microsoft removed Claude Code was not the day AI economics failed; it was the day one way of bolting AI onto an old organization disproved itself.
In that room full of startups at YC, another way is emerging. They are small, they burn tokens, they don't have an "employee AI usage rate" in their KPI tables, and their CFOs won't panic over a soaring token bill, because what they are burning is not "the co-pilot of employees" but "the replacement of employees".
In the next few years, all medium-sized companies that are still asking employees to "use more AI" will hit the same wall that Microsoft hit: a structurally increasing token bill.
But the real reason for hitting the wall is not that AI is too expensive; it's that the organization hasn't changed.
And most companies probably won't change anytime soon.