Just Now: Anthropic Issues an Apology

No more restricting AI research "in secret"

After a full day of mounting public backlash, the controversy over Anthropic's new model being deliberately made less capable appears to have taken a turn.

Just yesterday, Anthropic released its new model, Claude Fable 5. The model's capability is beyond doubt, yet it quickly drew a wave of criticism from the AI research community. The reason is simple: when Claude Fable 5 is used for AI research and development, it performs with reduced intelligence.

Moreover, the reduction happens in secret. If Anthropic's system detects that you are conducting AI research, it quietly makes the model less intelligent, with no indication to you that anything has changed.

In response, Anthropic said the measure was meant to prevent foreign adversaries from using the model to accelerate their AI research and development, and to protect its own lead.

The move enraged the community, forcing Anthropic to respond urgently.

Under pressure, Anthropic is now revoking the policy, as Wired reporter Max Zeff has just revealed. The outlet obtained a statement from Anthropic, which reads: "We are adjusting the security restrictions of Fable 5 for cutting-edge LLM development to make them visible."

More specifically, Claude Fable 5's protective measures around AI development will be visible to users. If the company suspects that a user is trying to use Claude to build a high-capability AI, it will alert the user that it will either reject the request or route it to a less capable model.

In other words, if Claude Fable 5 detects that a user is engaged in AI research and development, it will still respond with reduced intelligence, but the user will now be told so, rather than having it happen "secretly".

Anthropic also apologized in the statement: "We made the wrong trade-off. We deeply apologize for failing to strike the right balance."

While this Wired article sparked heated discussions on X, Anthropic also issued an official statement through its Claude Devs account.

The full text is as follows:

We are introducing some changes to make the security restrictions of Fable 5 for cutting-edge LLM development visible.

Starting this week, flagged requests will be visibly downgraded to Opus 4.8, in the same way as under our security restrictions for the cyber and biological domains. You will see this happen every time. On the API, any flagged request will return the reason for its rejection (the server-side fallback mechanism will be launched in the next few days).

We want to deploy Fable 5 to users quickly and safely. Visible security restrictions can be probed, so they must be robust, and achieving that takes time. Invisible security restrictions can be targeted more precisely, which lets us release quickly with a very low false-positive rate. We chose invisible security restrictions for this reason, but it was not the right trade-off. You should know what security restrictions we have set and the reasons behind them. We deeply apologize for failing to strike the right balance.

Making security restrictions visible makes them easier to bypass. To keep them resistant to "jailbreaking" attacks, more false positives will inevitably occur while we improve the classifier. We are also adjusting our biological and cyber classifiers so that they trigger less often on harmless requests. We know this is frustrating, and we will do our best to keep this period as short as possible.

If you think a request has been misflagged: run /feedback in Claude Code, click the thumbs-down icon on the fallback prompt at http://Claude.ai or Cowork, or fill out the security restriction appeal form for API requests. Your reports help us adjust these classifiers. Thank you for your feedback.

Even so, users' trust has been damaged. Although Anthropic has apologized and promised to revoke the policy, many people have voiced their distrust on social media.

Some suspect that Anthropic may still apply the policy in secret, since doing so would be very difficult to detect.

Meanwhile, rival OpenAI is taking a different approach: it is considering a significant cut to its token prices in order to compete with Anthropic for customers.

Anthropic has recently surpassed OpenAI in revenue, in valuation, and in some product areas such as coding tools. Both companies are preparing for an IPO, and high computing costs are a pain point for both.

Also yesterday, OpenAI's Codex began a gray-scale (limited rollout) test of a friend-invitation feature. Inviting friends reportedly resets a user's quota.

The competitive pressure between the two companies may bring users some unexpected benefits.

This article is from the WeChat official account "Almost Human" (ID: almosthuman2014). The author is Almost Human. It is published by 36Kr with authorization.

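The views expressed in this article are solely those of the author. The 36Kr platform provides information storage services only.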
