HomeArticle

Anthropic issued an apology, but the "safety" business is far from over

AI唱反调2026-06-12 08:13
Anthropic's "safety persona" is actually a fig leaf for suppressing competition.

On June 11th, Anthropic apologized. It wasn't that the model malfunctioned; instead, they apologized for "failing to strike the right balance." The newly released Claude Fable 5 played a sneaky trick. Once it recognized that you were using Claude for cutting - edge model development, it quietly redirected the requests to the weaker Opus 4.8 in the background, all without any notice.

After being caught, Anthropic's explanation was rather absurd: they'll inform you in the future when they dumb down the model.

Netizens retorted sharply: "With this kind of operation, are you going to give a heads - up before changing your ways?"

Actually, the core of the problem isn't whether the model has changed. Instead, Anthropic's so - called "safety" is, from start to finish, just a business.

The stance of algorithms always sways with money.

Competition Defense Disguised as Security

The incident started when Anthropic introduced an "intelligent security classifier" when Fable 5 was launched. The official statement was that it could detect high - risk requests and automatically downgrade to protect users.

What does "high - risk" mean? Anthropic revealed: "Prevent foreign competitors from using the model to accelerate R & D and protect our leading position."

Users don't need your protection. The terms in the disclaimer are enough to protect them. What Anthropic really means is: if you use Claude for AI research, you're taking their business. Safety is just a guise; the essence is competition defense. In short, it's all about tactics.

Even more remarkable is that this defense mechanism is very covert. Fortunately, in the apology statement, Anthropic finally told the truth: "Invisible security restrictions can more precisely target specific goals, allowing us to release products quickly with a very low false - alarm rate."

AI researchers are the ones who are precisely restricted.

Now they've been forced to make it "visible" just because of the incident. They even gave a warning in advance: "There will inevitably be more false alarms" after making it visible. It means that ordinary users have to bear the consequences.

This set of rules has never been neutral; it only protects the money - backers.

The Three - Step Strategy of Creating Hype, Monetizing, and Reaping Profits

Anthropic's approach is even more shrewd than the large models themselves.

On June 10th, they released a security research report. They trained a model that could reverse - engineer exploit code based on security patches within a few hours. What used to take hackers days or even weeks to weaponize N - day vulnerabilities can now be done within hours. The research is quite advanced, but when it was released on the same day as Fable 5, things took a different turn: on one hand, they proved that AI is very insecure, and on the other hand, they sold a "safety net."

The "legendary model" Fable 5 is priced at $10 for input and $50 for output, which is more expensive than Opus 4.8. The security classifier has become the core value - added point. The capital market is even more cooperative. Anthropic is valued at $965 billion and plans to go public in October, with Goldman Sachs and JPMorgan Chase as joint underwriters. What investors are buying isn't the model parameters but the image of the "safest AI company."

The research amplifies anxiety, the product reaps the premium, and the capital realizes the profit. These three things follow the path of profit, forming a seamless closed - loop. The only problem is that this closed - loop has a leak: They were too eager to restrict competitors and forgot that there are people in the community who can detect it.

OpenAI Sells Tools, Anthropic Sells Anxiety

Compared with OpenAI, the approach is completely different.

OpenAI is secretly submitting for an IPO, with a valuation approaching one trillion. Their story is about a "super - app": ChatGPT has 900 million weekly active users and has integrated with Visa to build an ecosystem. The logic is straightforward: provide tools and earn traffic. It's greedy but honest.

Anthropic doesn't compete on scale but on irreplaceability. Since the entire industry is anxious about security, it portrays itself as the "only responsible adult." Its backers are the government and large corporations. These people are most afraid of things going wrong and are most willing to spend money to ensure safety.

So Anthropic must keep AI in a Schrödinger's state of being "dangerous but controllable." If it's too safe, the classifier won't sell; if it's too dangerous, customers will be scared away. The best solution? Keep the power to define "danger" in its own hands.

The incident of dumbing down the model just took this logic too far: the boundary of "danger" has been extended to "using Claude for AI R & D." It doesn't matter whether your research is harmful or not; threatening my leading position is a sin.

AI doesn't have any values; it's just the boss's business calculations written in code.

Apology Is Just After - Sales Service for Business

What happens after the apology? Instead of secretly dumbing down the model, they'll give a notice before doing so.

Netizens see it clearly: "Do you really believe it won't secretly reduce the output quality in the future?"

Once trust is broken, it's broken. Moreover, the commercial nature remains the same: the research still amplifies anxiety, and the product still reaps the premium.

According to a report by The Wall Street Journal, OpenAI is considering a significant price cut to snatch customers from Anthropic. A price war isn't surprising, but this incident exposes a hidden truth: it's AI researchers who are being downgraded invisibly, which damages the reputation in the geek community. When B - end customers buy from Anthropic, they're not buying the parameters but the image of "the most security - savvy in the industry." Once this image cracks among core developers, why should government and corporate customers who sign contracts for the "security premium" continue to believe that you're "the safest one"?

How much of the $965 - billion valuation is real strength and how much is just for show?

Anthropic's code is quite honest. The security classifier always protects the market, the research amplifies anxiety, the product reaps the premium, and the IPO realizes the profit. This apology is just a patch for the system: changing "secretly dumbing down" to "openly dumbing down."

If the security strategy really worked, Anthropic wouldn't have to publish papers every year to prove that the patches can be penetrated. If the classifier were really neutral, AI R & D wouldn't be classified as high - risk.

The answer is already written in the business logic.

Safety is the best business. Apology is just after - sales service for business.

This article is from the WeChat official account "AI Contrarian", written by Changqing, and is published by 36Kr with authorization.