Claude drops a bombshell late at night, unveiling Fable 5, the most powerful "danger-level" model ever, at an unbelievable price.
Early in the morning of June 10th, Beijing time, Anthropic released its most powerful models, Claude Fable 5 and Mythos 5, without any advance notice. The former is open to the public, while the latter remains restricted to controlled programs such as Project Glasswing.
Judging from the name alone, Fable 5 seems to be just another new member of the Claude product line. According to Anthropic's own statement, however, Fable 5 is a Mythos-class model: the public version of Mythos that the company finally dares to offer to ordinary developers and enterprises. The name is simply the word "fable", while Mythos means "myth".
(Image source: Anthropic)
Why say that "they finally dare to release it"? Over the past two months, the name Mythos has been almost synonymous with "dangerous". In April this year, Anthropic launched Project Glasswing and handed the Claude Mythos Preview to a small group of security partners, including AWS, Apple, Cisco, CrowdStrike, Google, Microsoft, NVIDIA, the Linux Foundation, and Palo Alto Networks, to find and fix critical software vulnerabilities. At that time, Anthropic's attitude was very clear: the Mythos Preview would not be made widely available. The reason was simple: its cybersecurity capabilities were strong enough to be misused.
(Image source: Anthropic)
Anthropic stated bluntly that Mythos had discovered a large number of high-risk vulnerabilities, including long-standing issues in major operating systems, browsers, and critical software that no one had noticed before. In the hands of defenders, it is a security tool; in the hands of attackers, it could become a next-generation automated vulnerability hunter. Therefore, Mythos was confined to Project Glasswing.
Only now has Anthropic finally released a model of this class. Anthropic added a security classifier to Fable 5: high-risk requests may be refused, or they may be routed to Opus 4.8 as a fallback. In simple terms, the company put guardrails on a model that could not be released directly before and then launched it into the market. Lei Keji AGI (ID: leikejiagi) stayed up late to compile some information about this model, hoping it will be useful to you.
Anthropic Creates Another Model "Myth": Fable 5 Stands Unrivaled
Fable 5's scores are extremely impressive. On SWE-Bench Pro, it scored 80.3%, higher than the 77.8% of Mythos Preview, 69.2% of Opus 4.8, 58.6% of GPT 5.5, and 54.2% of Gemini 3.1 Pro. On this single benchmark alone, it already leads the top tier.
The really astonishing part is FrontierCode Diamond. This evaluation is closer to real-world software engineering: it examines whether the model can write code that maintainers are willing to accept. Fable 5 scored 29.3%, while Opus 4.8 scored only 13.4% and GPT 5.5 only 5.7%. This is not a matter of winning by a few percentage points. The previous-generation Claude and the main competitors have been left far behind.
In the past, many AI programming models could write code, but the engineering quality was often unstable. Some code ran but was difficult to maintain; some code passed tests but still caused problems when put into real projects. This is what makes FrontierCode so demanding. It cares about whether the model has engineering taste and can handle long-horizon tasks in complex code repositories. Fable 5's significant lead over Opus 4.8 here shows that what Anthropic has truly upgraded this time is the core of agent coding.
(Image source: Anthropic)
On Terminal-Bench 2.1, Fable 5 scored 88.0%, Opus 4.8 scored 82.7%, GPT 5.5 with Codex CLI scored 83.4%, and Gemini CLI scored 70.7%. This means that Fable 5 has outperformed OpenAI's Codex CLI combination at working in a terminal environment: carrying out tasks, reading error reports, modifying code, and continuing to make progress.
The scores are not the most important part. The truly scary thing about Fable 5 is that it already behaves like a model that can work on a real engineering job. You give it a task, and it can read the project, break down the task, call tools, fix errors, and keep going. Anthropic's press release mentioned that Stripe used Fable 5 to migrate a 50-million-line Ruby codebase, compressing what originally took a team two months into one day. Even if such cases contain a marketing element, they do not change the fact that AI coding is moving from assisting in writing functions to taking over the engineering process.
Let's make a not entirely fair comparison with DeepSeek V4-Pro Max. It scored 90.1% on GPQA Diamond, 93.5% on LiveCodeBench, and 80.6% on SWE Verified. This is already a very competitive result in the open-source camp. Qwen3.7-Max has also made its mark in areas such as GPQA, SWE Verified, and Terminal-Bench. For readers familiar with DeepSeek, this means that domestic and open-source models are not weak, and that on many traditional benchmarks they are already approaching the strongest closed-source models.
(Image source: Chart made by Lei Keji)
However, when it comes to indicators closer to real-world engineering and long-task execution, the pressure from Fable 5 suddenly becomes stronger. On SWE-Bench Pro, Fable 5 scored 80.3%, while the SWE Pro score of DeepSeek V4-Pro Max in the official table is 55.4%. On HLE with tools, Fable 5 scored 64.5%, while DeepSeek V4-Pro Max scored 48.2%. The Terminal-Bench versions are not exactly the same, but Fable 5 scored 88.0% on version 2.1 and DeepSeek V4-Pro Max scored 67.9% on version 2.0. Fable 5 leads by a wide margin on all of these.
These numbers may not tell the whole story, but the direction is clear. DeepSeek is strong in cost-effectiveness, open-source availability, and a batch of traditional capability indicators. Fable 5 is strong in the most expensive and hardest-to-sell tasks, especially long-task agents, complex engineering, tool collaboration, and real-codebase processing.
Visual and spatial reasoning abilities have also improved significantly. For example, in visual tasks for knowledge work such as GDP.pdf, Fable 5 scored 29.8%, higher than Opus 4.8, GPT 5.5, and Gemini 3.1 Pro. On Blueprint-Bench 2, Fable 5 scored 38.6%, slightly higher than GPT 5.5's 36.2% and far higher than Opus 4.8's 14.5%. This explains why Anthropic emphasizes that Fable 5 can reconstruct web applications from screenshots and extract precise numbers from scientific charts.
With Fable 5, handling multimodal inputs such as images and video is more like connecting screens, charts, interfaces, and code into a complete task chain. Once it understands a page, it can directly replicate that page; once it reads an image, it can turn the structure in the image into the next operation.
What makes Anthropic wary of releasing Fable 5 without guardrails is its cybersecurity and biological capabilities. On ExploitBench Cap%, Fable 5 scored 78.0%, Mythos Preview scored 69.0%, Opus 4.8 scored only 40.0%, and GPT 5.5 scored 34.0%. This gap is extremely large. For security defense, it means the model can help enterprises and open-source maintainers discover vulnerabilities more quickly; in the hands of the wrong people, it will also lower the barrier to attack.
(Image source: Anthropic)
On BioMysteryBench hard, Fable 5 scored 46.1%, higher than Mythos Preview's 29.6% and Opus 4.8's 40.0%. Anthropic also mentioned that Mythos 5 brought about a tenfold acceleration in drug-design-related processes, and that researchers preferred its molecular biology hypotheses about 80% of the time in blind tests. This sounds like good news for scientific research, but it is also enough to make regulators nervous.
Therefore, Fable 5's strength does not come only from being "smarter". It is strong in long tasks, engineering delivery, visual understanding, and high-value, high-risk professional scenarios such as security and scientific research. In a sense, it is currently the most powerful model that Anthropic can make available to the public, and it has no rival.
When Everyone Is Cutting Prices, Anthropic Sells AI as a Luxury
No matter how strong Fable 5 is, it cannot avoid a real-world problem: it is outrageously expensive. The official price is $10 per million input tokens and $50 per million output tokens. In comparison, Claude Opus 4.8 costs $5 for input and $25 for output, so Fable 5 is exactly twice the price.
What's more awkward is that its release coincides with a price war among large models. The current API price of DeepSeek V4-Pro has reached $0.435 per million input tokens and $0.87 per million output tokens. V4-Flash is even cheaper, at $0.14 for input and $0.28 for output.
Xiaomi's MiMo-V2.5 series also completed a permanent price cut at the end of May. The overseas version, MiMo-V2.5-Pro, likewise costs $0.435 for input and $0.87 for output, and Xiaomi emphasized that the price cuts reach as much as 99%. On Google's side, the Gemini API still offers many low-price models: Gemini 3.5 Flash costs $1.5 for input and $9 for output. On the subscription side, Google also reduced the price of its top-tier AI Ultra plan from $250 to $200.
(Image source: Chart made by Lei Keji)
That is to say, while the industry is pushing the prices of 1M context, agent coding, and multimodal capabilities down into the low-price range, Anthropic sets the price of Fable 5 at $10 for input and $50 for output. Compared with DeepSeek V4-Pro and MiMo-V2.5-Pro, Fable 5's input price is about 23 times as high and its output price about 57 times as high. Even compared with Gemini 3.5 Flash, it is several times more expensive. This price is enough to deter a large number of ordinary developers.
However, Anthropic's plan is clear. It does not want Fable 5 to do what cheap models can do. There is no need to use Fable 5 for daily Q&A, lightweight writing, or ordinary code completion. What it sells is time saved on high-value tasks such as large-scale codebase migration, long-context document analysis, complex enterprise processes, cybersecurity defense, and scientific research hypothesis generation. To put it bluntly, if your time is worth more than the bill, choose Fable 5.
If a model really can compress a two-month project into one day, it can justify a high price. Enterprises, however, will run the numbers before buying, and the model price is only the first layer: data retention is the second and compliance is the third. Fable 5 is listed as a Covered Model. On the Claude API it requires 30-day data retention and does not support zero data retention (ZDR). For financial, medical, legal, and core R&D teams, this is not a trivial matter.
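The multiples quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the per-million-token prices cited in this article; the dictionary layout and the function name price_multiple are illustrative choices, not part of any vendor's tooling.

```python
# Per-million-token API prices in USD (input, output), as quoted in this article.
PRICES = {
    "Fable 5": (10.0, 50.0),
    "Opus 4.8": (5.0, 25.0),
    "DeepSeek V4-Pro": (0.435, 0.87),
    "MiMo-V2.5-Pro": (0.435, 0.87),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Gemini 3.5 Flash": (1.5, 9.0),
}


def price_multiple(model, baseline="Fable 5"):
    """Return how many times the baseline's (input, output) price exceeds the model's."""
    base_in, base_out = PRICES[baseline]
    model_in, model_out = PRICES[model]
    return base_in / model_in, base_out / model_out


# Print Fable 5's price as a multiple of every other listed model.
for name in PRICES:
    if name != "Fable 5":
        in_x, out_x = price_multiple(name)
        print(f"Fable 5 vs {name}: {in_x:.1f}x input, {out_x:.1f}x output")
```

Rounded to whole numbers, the DeepSeek V4-Pro row gives the roughly 23x input and 57x output multiples cited in the text, and the Opus 4.8 row gives exactly 2x on both sides.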
(Image source: Anthropic)
Moreover, Fable 5 has another troublesome point. It automatically triggers a security review on sensitive topics such as cybersecurity and biology. Some questions it will refuse outright; others it will hand to the less powerful Opus 4.8 to answer. For ordinary users, this may just mean "asking a question and being refused", but for enterprises, it becomes an engineering problem.
This creates an interesting split into two camps. DeepSeek, MiMo, and Gemini are proving that strong models will become cheaper and easier for developers and enterprises to call at scale. Anthropic, on the other hand, is proving that truly top-tier models that sit close to the core of productivity may become more expensive and more like luxury-grade infrastructure.
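To see why this turns into an engineering problem, consider that a calling application now has three outcomes to handle instead of one. The sketch below is purely illustrative: the ModelReply fields, the model identifiers "fable-5" and "opus-4.8", and the handling policy are all assumptions made for this example and do not reflect Anthropic's actual response schema.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ANSWERED = "answered"    # served by the model that was requested
    FELL_BACK = "fell_back"  # served by a different, weaker model
    REFUSED = "refused"      # no answer returned


@dataclass
class ModelReply:
    # Hypothetical fields for illustration; not Anthropic's real response format.
    served_by: str
    refused: bool
    text: str = ""


def classify(reply, requested="fable-5"):
    """Map a reply onto one of the three outcomes described in the article."""
    if reply.refused:
        return Outcome.REFUSED
    if reply.served_by != requested:
        return Outcome.FELL_BACK
    return Outcome.ANSWERED


def handle(reply, requested="fable-5"):
    """Example policy: escalate refusals, flag fallbacks, accept the rest."""
    outcome = classify(reply, requested)
    if outcome is Outcome.REFUSED:
        return "escalate to human review"
    if outcome is Outcome.FELL_BACK:
        return f"accept with quality flag (served by {reply.served_by})"
    return "accept"
```

A pipeline that assumes every answer came from the strongest model would silently mix in weaker output, so an enterprise would probably want to log or flag the fallback case explicitly.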
But which camp will be the real future? No one can be sure.
Fable 5 Puts So Much Pressure That Competitors Are Having a Hard Time
The release of Claude Fable 5 will make many companies uncomfortable. OpenAI will be uncomfortable because Anthropic continues to make its mark in agent