Why is the AI service subscription model doomed to disappear?
On June 9th, Anthropic released its most powerful public model to date, Claude Fable 5. Ordinarily, this would be a celebration for paying users - the money you pay each month finally earns you the privilege of getting your hands on the flagship model first.
However, one line in the announcement immediately sparked huge controversy: After June 22nd, Fable 5 will be removed from all subscription plans, and continued use will require purchasing usage credits separately.
In other words, even if you've bought a membership, you can only use the flagship model for 14 days.
It's the first time in the large-model industry that a model has shipped with a built-in "eviction notice" on its release day.
Many people regard this as a mistake or an act of arrogance on Anthropic's part. I think the opposite: This is not a mistake, but a preview.
The AI subscription model is heading towards an inevitable demise - not because any company is greedy, but because the premise on which the subscription model is based is being torn down by AI itself.
The Flagship Model with a 14-Day Countdown
Let's get the facts straight first. According to Anthropic's official arrangement (June 9th, 2026), Fable 5 is included at no extra cost in the Pro, Max, Team, and seat-based enterprise plans from the release date through June 22nd. Starting June 23rd, it will be removed from these plans, and every token used after that will be deducted from prepaid usage credits at the same rate as the API.
This rate isn't cheap: $10 per million input tokens and $50 per million output tokens, exactly twice that of the previous flagship, Opus 4.8. More subtly, even during the included period, Fable 5 counts roughly double against the subscription quota - for the same task, it burns through the quota at twice the speed of Opus.
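To make these rates concrete, here is a minimal sketch of the per-session arithmetic. The Fable 5 rates are the ones quoted above; the Opus 4.8 rates are inferred from the "exactly twice" statement rather than taken from a price list, and the token counts in the example are purely hypothetical.

```python
# USD per million tokens. Fable 5 rates are quoted in the article;
# Opus 4.8 rates are inferred as half of them (an assumption).
FABLE_5_RATES = {"input": 10.0, "output": 50.0}
OPUS_4_8_RATES = {"input": 5.0, "output": 25.0}


def session_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the metered cost in USD for one session at the given rates."""
    return (
        input_tokens * rates["input"] + output_tokens * rates["output"]
    ) / 1_000_000


# Hypothetical agent session: 6M input tokens, 0.8M output tokens.
fable_cost = session_cost(6_000_000, 800_000, FABLE_5_RATES)
opus_cost = session_cost(6_000_000, 800_000, OPUS_4_8_RATES)
print(f"Fable 5: ${fable_cost:.2f}, Opus 4.8: ${opus_cost:.2f}")
```

Under these assumed token counts, the same session costs twice as much on Fable 5 as on Opus 4.8, and input tokens account for more of the bill than output tokens because an agent rereads far more than it writes.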
It's easy to imagine how users reacted. One Hacker News commenter said bluntly that this "give, then take away" move was disturbing and suspected that Anthropic was trying to push subscribers towards pay-as-you-go. Some developers ran their own tests and found that on the $100-per-month Max plan, a single agentic coding session consumed nearly $100 worth of tokens.
Users took to social media to complain that their token usage was completely insufficient | Image source: Twitter
Moreover, this isn't just an isolated move by Anthropic. In the past eight weeks, the entire industry has been doing the same thing: On April 2nd, OpenAI changed the billing method for Codex from per-message to per-token, aligning with the API, and later extended this to all existing enterprise customers.
On April 20th, GitHub froze new registrations for the Copilot personal edition. A week later, it announced a full switch to AI Credits billing, which was completed on June 1st - the Pro plan costs $10 per month and comes with $10 worth of credits.
Anthropic has been the most active: Starting from April 4th, it prohibited third-party agent frameworks like OpenClaw from consuming subscription quotas, and such usage now follows a pay-as-you-go model. On April 21st, the Claude Code column for the Pro plan on the pricing page quietly turned into a red X. After the community erupted, the change was reverted within 24 hours, and the official explanation was "a small test for about 2% of newly registered users." On May 14th, it officially announced that starting from June 15th, the Agent SDK and headless (interface-free) calls would be removed from the subscription pool and billed against separate credits at the API rate.
Three companies, eight weeks, the same direction - this isn't a coincidence. It's the entire industry giving the same answer to the same mathematical problem.
What does that mathematical problem look like?
It's Never the Computing Power That's Being Priced
The research firm SemiAnalysis recently brought this mathematical problem to the surface. They subscribed to every tier of Anthropic's and OpenAI's plans, ran long-running programming tasks until they exhausted the weekly limits, and then calculated the value of that usage at API prices.
Previously, the general understanding in the industry was that a $200-per-month package could at most generate about $2000 worth of tokens. The actual test results far exceeded this: The $20 Claude Pro plan has an upper limit of about $400, and the $200 Max 20x plan has an upper limit of about $8000. The situation at OpenAI is even more extreme - the $20 ChatGPT Plus plan can generate about $700 worth of tokens, and the $200 Pro 20x plan can generate about $14000 worth of tokens.
The subsidy multiple for the highest tier is 70 times | Image source: SemiAnalysis
Two caveats are in order. First, this is the upper limit when the quota is fully used, not the everyday usage level of ordinary users. Second, the API price includes a profit margin, so the converted figures don't equal the actual computing cost. However, pricing must account for the upper limit - an insurance company can't assume that no one will make a claim.
SemiAnalysis's actual test comparison of the consumable usage of each subscription tier | Image source: X @kimmonismus / SemiAnalysis
The subsidy itself isn't fatal. Streaming services and ride-hailing apps have both offered subsidies. Burning money to drive growth is an age-old practice in the internet industry. What's truly fatal is a fundamental difference between AI subscriptions and those businesses.
Netflix can offer monthly subscriptions because of two things: The marginal cost of adding one more movie is close to zero, and a person has at most 24 hours a day to watch. The same goes for Spotify. The implicit premise for the monthly subscription model to work is that consumption is limited by human physiological limits - what's really being priced isn't the content, but human time.
In the era of chatbots, AI just barely met this premise. No matter how talkative a person is, there's a limit to how much they can type in a day. The unused quota of light users was enough to cover the excess consumption of heavy users.
Then, Agents arrived.
What does an agent task look like? It reads 20 files, makes plans, modifies code, runs tests, reads error messages, and then iterates - after one round, the token consumption is 5 to 30 times that of an ordinary conversation. What's even more troublesome is that it doesn't require your presence. I have personal experience: I recently asked an agent to organize flight data from two airports and went to take a shower. When I came back, the task was done, and so was my quota. You sleep, but the meter keeps running.
Agents don't eliminate the price ceiling; they eliminate the consumption ceiling. And every direction the AI industry is evolving in - longer tasks, more autonomy, multiple parallel instances - is rushing towards the same end:
Completely removing humans from the consumption process.
GitHub was very straightforward in its announcement, saying that agent usage "is becoming the default." In other words, the scenarios where the subscription model can still barely work, such as a person sitting in front of a screen chatting one message at a time, will account for an ever-smaller share of the value AI creates.
At this point, someone might ask: If the subsidy is too deep, can't we just raise the price?
They've tried, and the result was even worse. Looking back at the table from SemiAnalysis, there's an abnormal detail: The more expensive the tier, the higher the subsidy multiple.
For Claude, the multiple for the $20 tier is 20 times, and for the $200 tier, it's 40 times. At OpenAI, it goes from 35 times to 70 times. Half of this is due to the pricing design - higher tiers offer larger quotas in multiples, which is equivalent to giving discounts to large customers. The other half is due to user behavior - those who are willing to spend $200 on the 20x package are aiming to use up the quota, and light users won't choose this tier.
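The multiples quoted here follow directly from the SemiAnalysis figures cited earlier: divide the API-priced value of a fully used quota by the monthly fee. The sketch below just reproduces that division; the plan labels and variable names are mine, and the dollar values are the approximate ones reported in the article.

```python
# (plan label, monthly price in USD, approx. API-priced value of a maxed-out quota)
PLANS = [
    ("Claude Pro", 20, 400),
    ("Claude Max 20x", 200, 8_000),
    ("ChatGPT Plus", 20, 700),
    ("ChatGPT Pro 20x", 200, 14_000),
]


def subsidy_multiple(monthly_price: float, api_value: float) -> float:
    """How many dollars of API-priced usage one subscription dollar can buy."""
    return api_value / monthly_price


multiples = {name: subsidy_multiple(price, value) for name, price, value in PLANS}
for name, multiple in multiples.items():
    print(f"{name}: {multiple:.0f}x")
```

Note that the price rises 10x between tiers while the usable value rises 20x, which is why the multiple doubles at the top tier for both companies.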
This has a name in the insurance industry: adverse selection. When the pricing of an insurance policy attracts only the highest-risk policyholders, that policy has no actuarial viability. Any fixed price will select for exactly the users whose usage exceeds it - this isn't an operational problem but a structural one. Adjusting the price will only make the selection more precise.
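A toy model makes the structural point easier to see. It is a deliberately extreme sketch, not a description of any real customer base: it assumes every user knows their own monthly usage, can buy the same usage pay-as-you-go, and subscribes only when the flat fee is cheaper. It also treats metered value as the provider's cost, which overstates real costs since API prices include a margin. The usage figures are invented.

```python
def provider_result(price: float, monthly_usage_values: list[float]) -> tuple[float, int]:
    """Return (provider profit, subscriber count) for a flat monthly price.

    Assumption: a user subscribes only if their metered usage would cost
    more than the flat price, i.e. pure adverse selection.
    """
    subscribers = [usage for usage in monthly_usage_values if usage > price]
    profit = sum(price - usage for usage in subscribers)
    return profit, len(subscribers)


# Hypothetical monthly usage values in USD, from light chat users to heavy agent users.
usage = [5, 15, 40, 120, 600, 3_000]

for price in (20, 200, 1_000):
    profit, count = provider_result(price, usage)
    print(f"price ${price}: {count} subscribers, "
          f"profit ${profit}, per subscriber ${profit / count:.0f}")
```

Under these assumptions, every price point loses money, and raising the price shrinks the pool to heavier users, so the loss per subscriber grows rather than shrinks. Real markets are softer than this because many light users subscribe for convenience, but agents push behavior in the direction the model describes.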
Throughout 2025, the industry actually tried all possible patches. In January, Sam Altman admitted on X that the $200-per-month ChatGPT Pro was losing money because the usage far exceeded expectations - the price increase attempt failed.
OpenAI tried but failed | Image source: X
In the middle of the year, Cursor switched from per-request billing to usage-based billing, which led to a wave of cancellations, and the CEO publicly apologized - changing the rules midway failed. In the summer, Anthropic imposed a weekly limit on Claude Code because some users were running agents around the clock, with a single user consuming tens of thousands of dollars' worth of compute - limiting usage only provoked anger.
After all the patches failed, there was a collective showdown in these eight weeks this year. Nick Turley, the head of OpenAI's ChatGPT, made it clear on the BG2 podcast: "In the current era, offering an unlimited package might be like offering an unlimited electricity package."
The Shell Remains, but the Core Is Dead
Of course, there's a seemingly strong counterargument: The subscription model is clearly doing fine. ChatGPT Plus is still $20 per month, Claude Pro is still on sale, and GitHub's code completion even retains the monthly subscription. Is the so-called demise just alarmist?
This counterargument deserves serious consideration because the phenomenon it describes is real. However, it misidentifies what's dead.
The soul of the subscription model has never been the form of "deducting money once a month," but the promise of "a fixed price and worry-free use" - you don't have to calculate the cost of each use, which was exactly the reason it defeated pay-per-use back then.
And what's happening now is: The billing cycle remains, but the promise has been taken away.
The $10 monthly fee for GitHub's Pro plan buys $10 worth of credits, and once they're used up, they're gone - this isn't a subscription; it's a prepaid top-up card in the guise of a subscription. Anthropic's credits are deducted at the API rate, and OpenAI's credits support automatic top-ups. The subscription model won't be cancelled; it will be hollowed out: the shell remains, but the core is dead.
GitHub Copilot's official announcement of the switch to AI Credits billing | Image source: GitHub
There's still one true enclave: pure chat. It can still be offered on a monthly subscription because it's the last scenario in AI where consumption is still limited by human time. However, the moat can't protect the enclave - every dollar of R&D in this industry is pushing AI from "you ask, it answers" to "it actively gets the work done for you." Chat subscriptions won't be killed; they will be marginalized: staying in place and watching the real value and real revenue gradually move into the pay-as-you-go world.
There's also a hard-to-ignore coincidence in terms of timing: According to TechCrunch (June 2026), when Fable 5 was released, Anthropic was preparing for an IPO, as was OpenAI. For the past three years, the subsidies were paid for by venture capital. Public market investors won't accept an income statement that says "the more heavy users there are, the more money is lost." Capital's exit timetable means the showdown can't be postponed indefinitely.
This means different things to different people. For enterprises, AI spending will now have to be managed like cloud spending - according to The Information, Uber's CTO said in an internal memo that the company burned through its entire 2026 AI budget in four months. Making budgets, installing monitoring systems, and routing models by task will become mandatory for every team. For individual users, in the past, light users subsidized heavy users. Now, everyone pays for their own meter.
Uber's AI budget transformation also caused quite a stir | Image source: The Information
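What "managing AI spending like cloud spending" might look like in its simplest form is a meter that records each call's cost and steers work to a cheaper model once the budget runs low. The sketch below is only an illustration under assumptions of my own: the class, the tier names "flagship" and "economy", and the 20% reserve threshold are hypothetical and do not correspond to any vendor's tooling.

```python
from dataclasses import dataclass


@dataclass
class BudgetMeter:
    """Tracks metered AI spend against a monthly budget (hypothetical helper)."""

    monthly_budget_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        """Add the cost of one completed call or agent run."""
        self.spent_usd += cost_usd

    @property
    def remaining_usd(self) -> float:
        return self.monthly_budget_usd - self.spent_usd

    def choose_tier(self, task_is_hard: bool, reserve_fraction: float = 0.2) -> str:
        """Use the flagship tier only for hard tasks while more than the reserve remains."""
        has_headroom = self.remaining_usd > reserve_fraction * self.monthly_budget_usd
        return "flagship" if task_is_hard and has_headroom else "economy"


meter = BudgetMeter(monthly_budget_usd=1_000.0)
first_choice = meter.choose_tier(task_is_hard=True)   # plenty of budget left
meter.record(850.0)                                    # a few long agent runs later
later_choice = meter.choose_tier(task_is_hard=True)   # now below the 20% reserve
print(first_choice, later_choice, meter.remaining_usd)
```

The design choice worth noting is that routing depends on both the task and the remaining budget, so the price signal reaches the decision of which model to run rather than showing up only on the invoice.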
To be honest, this isn't necessarily all bad. After the price signal returns, "Is it worth having AI run this task?" becomes a real question for the first time - and when an industry starts to seriously answer this question, it often marks the beginning of moving away from the money-burning narrative and towards a normal business.
At this point, I'd like to add a note: Before the meters are installed, the current subscription model may be the most generous this industry will ever be to users - use it while you can and cherish it.
The logic is hidden in the table from SemiAnalysis. From the user's perspective, it's not a death sentence at all, but a list of still-valid benefits: You pay $200 per month, and the platform burns up to $14000 worth of computing power on your behalf. Subsidies on this scale last appeared during the ride-hailing and food delivery wars - and we all remember how those two wars ended. After the subsidies ended, the prices never went back down.
So, take advantage of the current situation to run those heavy tasks. For example, Fable 5 will only be available in the subscription until June 22nd. Instead of waiting for the credit era to come and then being frugal, it's better to arrange those long tasks that you've always wanted to run but thought were too expensive. This isn't taking advantage - it's just being a clear-headed