
Coinbase CEO: 80% of AI workloads will be handled within 12 to 18 months by models that are 99% cheaper

Friends of 36Kr · 2026-06-10 08:58
AI pricing reform and soaring compute costs!

GitHub Copilot's new token-based pricing has sent some users' projected monthly bills from about $44 to more than $750, with others expecting $847. The change has set off the collapse of the AI industry's "subsidy for growth" model: OpenAI's profit margin is reportedly close to -122%, and Uber exhausted its entire annual AI budget in just four months. With investors' patience wearing thin, the CEOs of Coinbase and Hugging Face have offered an answer: models that cost 99% less, many of them small and open, may take over 80% of AI workloads.

GitHub Copilot's pricing change has set off a chain reaction in the AI industry and brought a deep debate about the sustainability of AI business models into the open. With usage-based billing replacing the flat subscription, users' bills are soaring. The CEOs of Coinbase and Hugging Face, however, point to a different way out: the rise of cheap models may fundamentally reshape the cost structure of AI compute.

On June 1, Microsoft-owned GitHub Copilot officially switched its billing basis from the number of requests to the number of tokens used. Monthly bills for some heavy users are expected to rise from a few dozen dollars to several hundred. The change quickly drew strong reactions on social media. Some users posted screenshots of their own cost estimates, showing monthly costs jumping from $44.68 to $754.29; others expect bills of $847.

Behind this pricing crisis lies the unraveling of the "subsidy for growth" model that the AI industry has long pursued. Brian Armstrong, the CEO of Coinbase, responded by predicting that 80% of AI workloads will shift to models that cost 99% less within 12 to 18 months, and that energy and compute will be the real bottlenecks.

Clement Delangue, the CEO of Hugging Face, cited Stanford University research data as empirical evidence that local, open, small models can substitute for advanced ones at scale.

01

GitHub Copilot's pricing change: The end of the subsidy era

GitHub Copilot's price adjustment did not come out of nowhere. In April this year, Mario Rodriguez, GitHub's chief product officer, publicly stated that the current pricing model "is no longer sustainable" given the rise of agentic AI. Previously, a short chat request and an autonomous programming task lasting several hours were charged the same fee, while GitHub quietly absorbed the steadily rising inference costs.

The new policy officially took effect on June 1. Under the new billing system, usage is converted into AI points based on the AI model used and the number of tokens consumed. Each point is worth $0.01. Subscribers receive a fixed base allotment of points plus additional flexible points, depending on their subscription tier. Because advanced AI models usually consume more tokens, actual costs vary significantly from model to model.
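The point mechanics described above can be illustrated with a rough sketch. Only the $0.01 value per point comes from the article; the per-model point rates, the tier names, and the included allotment below are hypothetical placeholders, not GitHub's published figures.

```python
from decimal import Decimal

POINT_VALUE_USD = Decimal("0.01")  # from the article: one AI point = $0.01

# Hypothetical rates: AI points charged per 1,000 tokens, by model tier.
POINTS_PER_1K_TOKENS = {
    "small": Decimal("0.5"),
    "advanced": Decimal("5"),
}


def points_used(model: str, tokens: int) -> Decimal:
    """Convert token usage on one model into AI points."""
    return POINTS_PER_1K_TOKENS[model] * Decimal(tokens) / Decimal(1000)


def overage_usd(usage: list[tuple[str, int]], included_points: Decimal) -> Decimal:
    """Dollar cost of the points consumed beyond the plan's included allotment."""
    total = sum((points_used(model, tokens) for model, tokens in usage), Decimal(0))
    extra = max(total - included_points, Decimal(0))
    return extra * POINT_VALUE_USD


# Hypothetical month: 2M tokens on a small model, 1M tokens on an advanced one,
# with 3,900 points assumed to be included in the subscription.
month = [("small", 2_000_000), ("advanced", 1_000_000)]
print(overage_usd(month, Decimal(3900)))
```

Under these assumed rates the advanced model accounts for most of the points even though it handles fewer tokens, which is the effect the article describes.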

Users reacted quickly and strongly. In the GitHub community on Reddit, a user who claims to have been a Copilot Pro+ subscriber since day one wrote: "$39 per month already seemed expensive to me, but it was worth it. Now, with this new AI point system, I've run the numbers, and I expect a bill of $847 next month." Many users compare the change to Uber's playbook: build dependence with extremely low prices, then raise prices sharply once users are hooked.

Arun Chandrasekaran, an analyst at Gartner, told Business Insider that the Copilot case "might just be an early example." He expects that as advanced reasoning models and agentic workflows drive a sharp increase in inference compute consumption, more companies will switch to token-based or usage-based billing.

02

The systemic risk of the subsidy model

This pricing crisis reflects deeper structural contradictions in the AI industry. Investor Tommy Shaughnessy published a post on social media systematically analyzing what he sees as "the most obvious collapse path for AI."

He points out that flat per-seat subscription fees have long been heavily subsidized and sit far below the actual cost of heavy use. Once companies switch to API calls for reasons such as data protection or compliance approval, they face the real prices of usage-based billing, and actual usage is often much higher than expected. He cited several cases in support, including that Uber exhausted its entire annual AI budget in just four months in 2026.

Shaughnessy also points out that the profit margins of large AI companies are currently deeply negative: OpenAI's profit margin is reportedly close to -122%. This means they depend entirely on outside capital to buy GPUs, train models, and keep subsidizing usage. He believes that once investors begin to doubt the expected returns, the entire flow of capital is at risk of reversing.

However, he also notes the limits of this argument: if AI actually delivers new drugs or entirely new business models, users' willingness to pay for expensive AI services will rise significantly, and the pressure described above may ease.
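To make the reported figure concrete, here is a small worked calculation. It assumes "profit margin" means (revenue - cost) / revenue; the article does not specify which margin definition the -122% figure uses.

```python
def cost_per_revenue_dollar(margin: float) -> float:
    """Spending per $1 of revenue implied by a profit margin.

    Assumes margin = (revenue - cost) / revenue, so cost / revenue = 1 - margin.
    """
    return 1.0 - margin


# A margin of -122% would imply roughly $2.22 of cost for every $1 of revenue.
print(round(cost_per_revenue_dollar(-1.22), 2))
```

Under that definition, the gap of about $1.22 per revenue dollar is what would have to be covered by outside capital.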

03

Coinbase's CEO: Cheap models will dominate the future

With compute costs continuing to climb, Brian Armstrong, the CEO of Coinbase, has offered his own framework for thinking about the problem. He believes demand for intelligence is almost unlimited, but that the market will quickly split: 80% of workloads will shift to models that cost 99% less within 12 to 18 months, while the remaining 20% of tasks, those that push the limits of intelligence, such as scientific breakthroughs and high-level agent coordination, will still run on the latest generation of advanced models.
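A back-of-the-envelope calculation shows what this split would mean for total spend. It is a sketch under simplifying assumptions that are not Armstrong's: every workload costs the same at today's prices, total volume stays constant, and the 80% share is measured in those cost terms.

```python
def blended_cost(share_cheap: float, discount: float, baseline: float = 1.0) -> float:
    """Total cost relative to running everything on the advanced model.

    share_cheap: fraction of workloads moved to the cheaper model
    discount:    how much cheaper that model is (0.99 means 99% cheaper)
    """
    cheap_price = baseline * (1.0 - discount)
    return share_cheap * cheap_price + (1.0 - share_cheap) * baseline


relative = blended_cost(share_cheap=0.80, discount=0.99)
print(f"{relative:.3f}")  # 0.208
```

Under these assumptions the bill falls to roughly 21% of its former level, and almost all of what remains is the 20% of work that stays on advanced models. If usage grows at the same time, as the article suggests it does, the saving shrinks accordingly.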

Armstrong compares this trend to the consumer electronics market: the number of buyers for a top-of-the-line MacBook or gaming PC always stays small, and prices in the AI industry are falling even faster than Moore's Law would suggest. From this he concludes that the real constraints in the future will be energy and compute, not the capabilities of the models themselves.

Armstrong has also described Coinbase's internal practice: the company is actively rolling out a prompt routing strategy that sends requests to cheaper models. In some scenarios, overall costs have stayed almost flat even as token usage continues to grow exponentially.
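A prompt router of this kind can be sketched in a few lines. This is an illustrative toy, not Coinbase's implementation: the model names, prices, keyword list, and length threshold are all hypothetical, and production routers typically rely on classifiers or evaluation data rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_1m_tokens: float  # hypothetical price, for illustration only


CHEAP = Model("small-open-model", 0.30)    # hypothetical
FRONTIER = Model("frontier-model", 30.00)  # hypothetical

# Hypothetical hints that a request needs the more capable model.
HARD_HINTS = ("prove", "refactor", "multi-step", "research")


def route(prompt: str, max_cheap_chars: int = 2000) -> Model:
    """Send long or hard-looking prompts to the frontier model, the rest to the cheap one."""
    text = prompt.lower()
    if len(prompt) > max_cheap_chars or any(hint in text for hint in HARD_HINTS):
        return FRONTIER
    return CHEAP


for prompt in ("Summarize this email in two sentences.",
               "Refactor the billing module and prove the invariants hold."):
    print(route(prompt).name)
```

Defaulting to the cheap model and escalating only on a signal mirrors the idea that most requests do not need the most capable model.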

04

Open, small models: Empirical evidence for a future world with multiple models

Clement Delangue, the CEO of Hugging Face, cited Stanford University research data to put numbers on the substitution potential of cheap models: the accuracy of local models on real chat and reasoning queries has risen from 23.2% in 2023 to 71.3%, while both cost and energy consumption are only a fraction of those of advanced APIs.

From this, Delangue concludes that the future will be one of "multiple models": for most workloads, local, open, small, and cheap models will be the preferred choice, and advanced APIs will be called only when there is no other option.

Shaughnessy's analysis is in line with this idea. He points out that DeepSeek V4 achieves results similar to Anthropic's Claude Opus on the SWE-bench programming benchmark at only about one-thirtieth of the price, and that the cheapest open model costs only about one-hundredth. He believes that the steady release of advanced models by Chinese labs lets inference providers obtain their most significant cost item, the model itself, for free, which fundamentally undercuts the pricing power and profit prospects of large closed-source AI companies.

This article is from the WeChat account "Hard AI", author: Zhao Ying, editor: Hard AI. Published by 36Kr with permission.