Anthropic's "The Boy Who Cried Wolf": Those Calling for a Halt Run the Fastest

Filing applications, calling a halt, launching new products — all completed within ten days.

Three days after secretly submitting its IPO documents to the SEC, Anthropic suddenly published a ten-thousand-word article, calling on the world to prepare a "brake" for cutting-edge AI development. Five days later, it launched its most powerful model to date, Claude Fable 5, and simultaneously released the unrestricted version, Mythos 5.

The filing, the call for a halt, and the launch of a new model all took place within ten days.

The article, published in early June, is titled "When AI Builds Itself." It was co-authored by Marina Favaro, the head of research at Anthropic, and Jack Clark, the head of policy and co-founder. Its core concept is "recursive self-improvement," which means that AI can design, train, and upgrade its next generation almost without human intervention.

In the past few years, the popular risk associated with AI has been seen as "AI replacing humans." Anthropic has taken the issue a step further. When AI starts to replace AI researchers, technological progress may no longer just accelerate linearly but could enter a phase of self-acceleration.

Another AI giant, OpenAI, has also joined this discussion about applying the brakes.

On June 8th, OpenAI released a strategic vision document co-signed by CEO Sam Altman and Chief Research Officer Jakub Pachocki. It proposed that an international organization should be established to coordinate the development of leading global AI and, when necessary, "slow down cutting-edge development" to allow social resilience, safety, and alignment research to catch up with technological progress.

On one hand, there is a racer with a valuation approaching one trillion dollars, leading the way in the public market. On the other hand, there is a whistleblower warning the world that "we only have an accelerator, no brake." These two identities are combined in one entity.

So, the question arises: Is this a warning from the conscience, or is the leading player setting the rules for the next stage of competition in advance?

01. What exactly is Anthropic afraid of?

The most disturbing part of Anthropic's long article is that it shifts the location of the risk from the external world to the inside of AI companies.

Inside Anthropic, the more senior an employee is, the more open-ended the tasks they are assigned. For example, a new employee might be tasked with "fixing the broken export button," while a senior employee might have to figure out "why the network slows down under high load," and a higher-level employee might need to answer "what the team should do next quarter."

If we compare AI to an ordinary employee, it is climbing up this pyramid step by step.

In the early days, AI was just a completion tool. Humans wrote code, built systems, and trained models. Later, it could generate code snippets and fix bugs. Subsequently, coding agents started to run code on their own and delegate hours of work to another agent. Anthropic calls this a "closed loop" – in the future, agents may have the ability to build and train models on their own. If so, the subsequent versions of Claude will be continuously improved by Claude itself.

This is "recursive self-improvement."

This is not a science fiction speculation; it comes from Anthropic's own production data.

As of May 2026, over 80% of the code merged in Anthropic's code repository was written by AI. In early 2025, when Claude Code was still in the research preview version, this proportion was only in the single digits. For a typical engineer, in the second quarter of 2026, the amount of code merged per day was eight times that in 2024. An employee said that he hadn't written a single line of code by hand for about five months.

Claude doesn't just write simple code. In the most open-ended and difficult-to-define tasks, its success rate reached 76% in May 2026, an increase of 50 percentage points in half a year. For example, during a routine upgrade, tens of thousands of training tasks crashed one after another. The engineers basically just gave Claude a description and cluster permissions. It tested each environment variable and within two hours, it identified an obscure debugging parameter, reproduced the problem, and fixed it. This would usually take two or three days of work in the past.

What alarms Anthropic even more is the research process. Every time a new model is released, it asks Claude to solve the same problem: modify the code for training a small model to make it as fast as possible without errors. Usually, a skilled human researcher takes four to eight hours to achieve a four-fold speed increase. In May 2025, Opus 4 achieved an average speed increase of about three times, slightly inferior to humans. By April 2026, this number became about 52 times. In less than a year, Claude went from being close to humans to leaving them far behind in this regard.

Connecting these curves shows a path where humans are constantly retreating.

Even within this globally leading AI company, employees often feel a sense of nihilism. An employee said, "On days when everything goes smoothly, I can't help but feel that what I'm doing is no longer important."

What humans can still hold onto today is the so-called research taste and judgment: deciding which problems are worth solving, which results are reliable, which paths are dead ends, and seeing the bigger picture beyond the immediate tasks. However, Anthropic is not optimistic about this last bastion. The so-called "taste" might just be another ability that AI fails at first and then learns.

Based on this, Anthropic has outlined three possible futures: First, the current trend stops, but the existing capabilities are widely spread; second, AI takes over a large part of the development, while humans still hold the steering wheel, and organizational efficiency is multiplied; third, AI has the complete ability of self-improvement and starts to design and train its successors, that is, "recursive self-improvement."

Anthropic says that the third scenario has not arrived yet and is not inevitable, "but it may come earlier than most institutions are prepared for."

So, what should be done?

Anthropic's answer is to hope that the world keeps an option – to slow down or even pause cutting-edge development when necessary, allowing social structures and alignment research to keep up with technological progress.

02. Calling for a pause, but it's hard to pause

Anthropic's proposed pause button has three levels.

Within the company, more stringent safety assessments, red lines, and release thresholds should be established; at the industry level, leading laboratories should coordinate with each other. One company can't apply the brakes while another continues to accelerate; at the international level, governments, scientists, advocacy organizations, and competitors should sit at the same table to discuss how to verify whether the "pause" has really occurred.

It has made its stance clear: as long as such a verification mechanism exists and other leading developers also stop in a verifiable way, it is willing to slow down or even pause together.

However, each premise points to the same underlying message: it won't stop alone.

This is like a prisoner's dilemma.

Anthropic believes that without a global coordination mechanism, companies and governments will make safety decisions under competitive pressure. Training cutting-edge models is not like guarding a missile silo. Computing power can be rented, tasks can be split, and cloud services can be accessed across borders. As long as someone stops, the one who secretly moves forward may take the leading position.

The most direct example is the close competition between Anthropic and OpenAI.

On April 23rd, OpenAI released GPT-5.5 and simultaneously launched the programming assistant Codex, directly targeting Anthropic's Claude Code. A little over a month later, Anthropic upgraded its flagship model to Opus 4.8. The outside world generally believes that this step was forced by GPT-5.5 and Codex.

On June 9th local time, Anthropic launched the Fable 5 version, released the Mythos 5 series of models, and a new generation of agent development tools, continuing to expand the front line into code generation, complex task execution, and enterprise-level workflows. Anthropic claims that Fable 5's capabilities exceed all its previously publicly released models. There were only five days between warning about the potential runaway of cutting-edge AI and launching the most powerful model into the market.

Now, this competition has spread to the capital market.

On the same day as the Opus 4.8 update, Anthropic announced the completion of a new round of $65 billion in financing, with a post-investment valuation of $965 billion, approaching one trillion dollars. On June 1st, it secretly submitted its IPO documents to the SEC ahead of OpenAI. OpenAI quickly followed suit: on June 8th, it announced the secret submission of a draft S-1 prospectus, aiming to go public as early as September and no later than the fourth quarter of this year. The market's valuation expectation is as high as one trillion dollars.

The two companies are competing to become the benchmark for "cutting-edge model companies" in the capital market. In a capital race of this magnitude, no one dares to let off the accelerator voluntarily.

Take Anthropic for example. Its latest valuation has exceeded OpenAI's $852 billion valuation after its March financing, but the user scales of the two companies are not in the same league. In March this year, the monthly active users of ChatGPT's mobile app were about 961 million, while Claude had about 23.5 million, a difference of 40 times. What supports this valuation are enterprise customers and the value per user. According to third-party estimates, each Claude user contributes an annualized revenue of about $808, 30 times that of a ChatGPT user.

Its current valuation, financing ability, and IPO expectations are all based on the same premise: it can continuously launch stronger models and convert the model capabilities into revenue, customers, and platform status. If it really stops unilaterally, the valuation logic will be weakened, enterprise customers will worry about the slowdown of iterations, developers may flow to OpenAI, Google, or other platforms, and talents will also re-evaluate whether they are still at the forefront.

So, what Anthropic is really calling for is not "I'll stop first," but "if everyone can be verified to stop, I'll stop too."

OpenAI has the same attitude.

In the article co-signed by Sam Altman and Chief Research Officer Jakub Pachocki, OpenAI proposed that an international organization should be established to coordinate the development of leading global AI and slow down cutting-edge development when necessary to allow social resilience, safety, and alignment research to catch up with technological progress.

OpenAI is talking about international coordination, not unilateral deceleration; Anthropic is talking about a verifiable pause, ensuring that it will stop only when others stop. Both companies admit that applying the brakes is necessary, but they both set the premise on "everyone applying the brakes together."

Moreover, there are more than just two companies on this track. Behind them are cloud providers, chip manufacturers, sovereign AI, military demands, financial capital, and national strategies. The stronger the model, the longer the interest chain, and any link may become an opponent of deceleration. In such a structure, "pausing" is hardly a simple moral choice. It's more like a high - difficulty multi - party agreement. Everyone has to believe that others have really stopped and that they won't lose the future by stopping first.

It can't stop, but it brings up the topic at this point. So, many people are not questioning whether Anthropic is right, but whether it is sincere.

The criticism from NYU professor Gary Marcus is quite representative. In his view, Anthropic doesn't really want to pause. It's just a rhetorical performance to accompany the IPO, using a "nearly cost - free" gesture to make people seriously discuss an option it never intended to implement.

03. Why shout "wolf is coming" while continuing to accelerate?

Warning about risks while accelerating iterations has almost become the standard practice of cutting - edge AI companies. On the surface, they constantly talk about safety, alignment, regulation, and social preparedness; behind the scenes, they continue to release models, expand computing power, compete for developers, and proceed with financing and listing as usual.

This contradiction has become more and more obvious in the past year.

Anthropic's CEO, Dario Amodei, has frequently warned that in the next few years, AI may gain the ability to change the social structure and even bring about the risk of runaway. He has called on the government and the industry to establish stricter safety mechanisms in advance. However, at the same time, Anthropic has not slowed down. Instead, it has continuously launched more powerful Claude models in a year, constantly refreshing the context length, code - writing ability, and intelligence level, and actively competing for enterprise customers and the developer market.

OpenAI's performance is also typical. CEO Sam Altman has long emphasized the huge risks that super - intelligence may bring. He has repeatedly said that humans need to prepare for AGI and even supports the establishment of a regulatory framework for cutting - edge models. However, in reality, OpenAI has continuously accelerated the product release rhythm, from GPT - 4 to GPT - 4o, and then to the release of stronger models and Agent capabilities. It almost raises the competition threshold every few months.

The onlookers are starting to lose faith.

In the eyes of many, the two leading model companies are positioning themselves as "responsible players" before going public.

Another criticism is more direct. Some media have said that companies like Anthropic want to turn "safety" into a moat.

When a company that is already in the first echelon and about to enter the public market repeatedly emphasizes that "cutting - edge models are too dangerous," the discussion easily turns to higher entry thresholds, more complex evaluation systems, and more expensive compliance costs. For leading companies, these are not difficult to achieve and can even become advantages; but for the open - source community, small and medium - sized teams, and latecomers, they may face the risk of being eliminated.

From this perspective, the question changes from "Is AI dangerous?" to "Who has the right to define danger?" Once the standards are set by a few cutting - edge AI companies, those who are finally blocked out may not be the most dangerous players, but rather the ones with the least say.

This is why NVIDIA CEO Jensen Huang has repeatedly criticized the disaster narratives of Dario Amodei and others. In his view, exaggerating the danger of AI will only amplify social panic.

Behind Jensen Huang's view, there is actually a collision between two types of businesses.

For NVIDIA, the more open and widespread AI is, the more prosperous the chip, developer, and cloud ecosystems will be. For cutting - edge model companies like Anthropic, the more AI is defined as high - risk, the more likely the industry will move towards a licensing system, evaluation system, and high compliance costs, and leading companies are naturally more capable of bearing such costs.

So, what the opponents really mind is that Anthropic occupies three positions at the same time: the leader in the cutting - edge competition, the proposer of the risk narrative, and the potential beneficiary of future rules.

Friedrich Hayek once wrote in "The Road to Serfdom" that when people strive to shape the future according to lofty ideals, they may unknowingly create results that are completely opposite to their original intentions. Applying this to today's AI industry, it almost seems like an accurate prediction.

OpenAI was established as a non - profit organization in 2015 with the mission of ensuring that AGI "benefits all of humanity." However, as the model competition accelerated, computing power costs soared, and the demand for funds increased, it established a "capped - profit" subsidiary in 2019 and was subsequently more deeply involved in commercialization, financing, and platform competition.

An organization that wrote "for all of humanity" in its charter has also been continuously rewritten by competition and capital in ten years.

Anthropic is facing the same situation today. It may truly believe that the AI risk is approaching, and it may also truly want to buy time for the industry. However, in this industry, lofty intentions are hard to stand up against fierce competition alone.

The reality of the cutting - edge AI competition is that whoever stops first may lose the market, talents, capital, and influence. So, all companies know that speed is dangerous, but they all continue to accelerate; all companies discuss applying the brakes, but they all hope that others will apply the brakes first.

Therefore, Anthropic may not be lying, nor is it just putting on a show. It may both fear what it is creating and be unable to stop creating it.

*The pictures in this article are from pexels.

This article is from the WeChat official account "AIX Finance". Author: Chen Dan. Republished by 36Kr with permission.

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。

Anthropic "The Boy Who Cried Wolf": The Ones Who Call for a Halt Run the Fastest

01. What exactly is Anthropic afraid of?

02. Calling for a pause, but it's hard to pause

03. Why shout "wolf is coming" while continuing to accelerate?