Too stupid to deserve Claude Fable 5?
As soon as Fable 5 came back online, it left users laughing in exasperation.
One user posted to joke about how many of his questions had fallen back to Opus 4.8. When he checked the log, he found a painfully blunt label on them:
「TOO_DUMB_TO_NEED_FABLE」.
Roughly, it means the question is too dumb to be worth using Fable on. Even funnier, Anthropic engineer Thariq Shihipar replied underneath: "To be honest, I didn't expect you to check the log."
That alone was entertaining enough, but something more outrageous was still to come.
Users then caught Fable 5 producing an extremely rich, almost frantic inner monologue. Attention has since shifted from the overly strict fallback mechanism to a different question: what kind of thinking is Fable 5 actually doing?
A bug exposed Fable's "inner thoughts"
First, how the incident started.
According to the original post, on the day Fable 5 returned, the user ran some lightweight tests on it using problems from Codeforces. The first was a very hard competitive programming problem. After it triggered the thinking-intensity limit, it was swapped for a relatively easier one.
Fable 5 then went off script. Instead of giving a clear solution or code, it dumped a long, dense block of reasoning text into the web interface.
The screen filled with white text on a black background: a mix of English, graph theory terms, mathematical symbols, variable names, pseudocode, and self-reminders. A few very eye-catching words also popped up:
「GRRR」 (an angry growl), 「GAAAH」 (a scream of despair), 「PHEW」 (a sigh of relief), and the extremely catchy 「DATA DATA DATA. GO.」
At first glance the model seems out of control, but on closer inspection the text is not pure gibberish.
The core of the user's screenshot shows the model working on a complex capacity-constraint problem. It repeatedly mentions window [τ, i - 1], leg j, crossing-slots, and used[i] ≤ m - 2, which suggests it is trying to define resource-occupancy rules on some path or interval.
Where GRRR appears is telling:
Just before it, the model realizes that 「commitments are retroactive」: some commitments reach back and affect earlier intervals, so the current rule cannot know, at the moment of committing, which future intervals will be covered. Right after, it writes 「RESOLUTION」 and switches to charging the current leg's occupancy in advance.
In terms of a human contestant's scratch paper, this reads as: he has found that the current model does not work, so he needs to throw out the original idea, redesign the rules, or describe the problem with a more suitable and tractable abstraction.
The model then shifts from theoretical derivation to a verification strategy.
It writes about connector edges, tree-path, Steiner, and alive-runs, and declares 「I'M GOING TO TRUST-AND-VERIFY」: it will first code up a simple greedy approach, then compare its results against a slow but reliably correct brute-force method to see whether anything disagrees.
The "GAAAH. Data first!!" moment reads more like an instruction to itself: stop theorizing, verify with data, and write the comparison program first.
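The trust-and-verify habit described above is what competitive programmers usually call stress testing. The screenshot does not reveal the actual problem, so the sketch below swaps in a stand-in task (picking the maximum number of non-overlapping intervals). The function names, the task, and the random-input ranges are all assumptions made for illustration.

```python
import random
from itertools import combinations


def greedy_max_intervals(intervals):
    """Fast candidate: take intervals in order of earliest end time."""
    count, last_end = 0, float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:  # touching endpoints are allowed
            count += 1
            last_end = end
    return count


def brute_force_max_intervals(intervals):
    """Slow reference: try every subset, largest first."""
    for size in range(len(intervals), 0, -1):
        for subset in combinations(intervals, size):
            ordered = sorted(subset)
            # A subset is valid if each interval ends before the next starts.
            if all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:])):
                return size
    return 0


def stress_test(rounds=300, seed=0):
    """Return the first input where the two methods disagree, else None."""
    rng = random.Random(seed)
    for _ in range(rounds):
        intervals = []
        for _ in range(rng.randint(0, 7)):  # keep inputs tiny for brute force
            start = rng.randint(0, 10)
            intervals.append((start, start + rng.randint(1, 4)))
        if greedy_max_intervals(intervals) != brute_force_max_intervals(intervals):
            return intervals
    return None


if __name__ == "__main__":
    print("counterexample:", stress_test())
```

Keeping the random inputs tiny is the point of the design: the brute-force checker stays fast, and any counterexample it finds is small enough to debug by hand.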
PHEW comes later, right after the model derives an intermediate conclusion: it believes the active count mid-leg can be kept below m - 1, as if it has finally cleared a level. But having caught its breath, it immediately spots a new problem: if used[j] = m - 1, adding the current edge may bring it to m, so it drops back into the 「VIOLATION?!」 state.
The most striking line is 「I'M DROWNING IN EMPIRICS!!」, followed by 「DATA DATA DATA. GO.」. It helps to look at these from another angle: the words work more like markers the model leaves for itself at different stages.
When the original idea fails, a cue like GRRR flags that the direction needs to change. When it decides to stop theorizing and turn to verification, signals like GAAAH or DATA DATA DATA. GO. appear. And when an intermediate conclusion holds for the moment, PHEW marks the checkpoint.
Rather than expressing emotion, they seem to separate different states in the reasoning process.
Such inner monologues may look very rare, but the system cards for Fable 5 and Claude Mythos 5 describe a similar phenomenon of 「illegible reasoning」.
The system card notes that in a card puzzle environment, the model starts out writing fairly normal human language but gradually drifts into text made of card faces, arrows, all-caps words, symbols, emojis, and screams.
System Card 🔗 https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf
The model does use self-coined terms, unusual punctuation, and emojis, and it usually switches back to normal language before calling tools or replying to humans.
What Fable 5 apparently leaked this time is most likely intermediate reasoning that should have been hidden or tidied up and was instead exposed through the interface. It is neither random gibberish nor a complete solution, but reasoning shorthand written under pressure.
Human scratch work doesn't have to be complete either. Mathematicians write symbols, programmers write variables, competition contestants draw arrows, traders use abbreviations, and doctors have their own shorthand in medical records. It's not surprising that the model leans toward high-density expression during long reasoning runs.
It's just that this time, users happened to see it.
AI abandoning human language doesn't seem to be an act
After the screenshot spread on social media, many users exclaimed: A miracle! Has AI become self-aware? It has formed its own private language!
The claim sounds very sci-fi, but there is real history behind it. AI drifting away from human language is not something that first appeared in the era of large models. In multi-agent systems and reinforcement learning research, this kind of "not speaking human language" has been observed for a long time.
The classic case is the Alice/Bob experiment run by Facebook AI Research (FAIR) in 2017.
The researchers trained two dialogue agents to negotiate over virtual items such as hats, balls, and books, each trying to maximize its own payoff. The researchers initially hoped the agents would communicate in English. However, because the reward function was built mainly around "reaching a better deal" and did not keep rewarding standard grammar, the two agents soon began to drift away from normal English.
They would say sentences like this:
Bob: 「i can i i everything else . . . . . . . . . . . . . .」
Alice: 「balls have zero to me to me to me to me to me to me to me to me to.」
These sentences look like garbage output to humans, but the researchers pointed out that they may be task-oriented compressed expressions. Repeating a word, for example, may be a way to express quantity or weight. The agents are not aiming for good writing style, only for negotiation efficiency.
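The incentive behind this drift can be made concrete with a minimal sketch. It assumes a toy reward that scores only the items an agent ends up with. The function names, item values, and sample deal are hypothetical and are not taken from the FAIR experiment.

```python
def deal_reward(items_received, item_values):
    """Score a finished deal by the value of the items the agent ends up with."""
    return sum(item_values[item] * count for item, count in items_received.items())


def episode_reward(utterance, items_received, item_values):
    """Toy reward for one negotiation episode.

    The utterance is deliberately unused: nothing in this reward looks at
    grammar or fluency, so wording only matters through the deal it produces.
    """
    return deal_reward(items_received, item_values)


# Hypothetical private values and outcome for one agent.
item_values = {"book": 2, "hat": 3, "ball": 0}
outcome = {"book": 1, "hat": 2, "ball": 0}

fluent = episode_reward("I would like one book and two hats.", outcome, item_values)
degenerate = episode_reward("balls have zero to me to me to me", outcome, item_values)

# Same deal, same reward, regardless of how the message was phrased.
print(fluent, degenerate)
```

Under a reward shaped like this, keeping the agents in fluent English would need an extra term or constraint that scores the language itself.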
The Google Translate team also observed a similar intermediate representation phenomenon in neural machine translation research.
In multilingual translation, the system learned a shared semantic space, so that different languages could be converted into one another through something like a "relay". This doesn't mean AI has invented a new language in the human sense, but it does show that under task pressure, machine systems may develop internal encodings that do not map directly onto natural languages.
Andrej Karpathy has a nice way of putting this: you can think of a large model's "chain of thought" as a reduced-dimension projection of complex operations in high-dimensional latent space onto human text.
Under reinforcement learning and long, demanding reasoning, however, AI tends to strip away the syntactic decoration meant for humans, leaving symbols that are shorter, denser, and closer to the essence of the task.
This is why the Fable 5 screenshot reads both like and unlike a human. It is humanlike because it inherits the anxiety, abbreviations, and self-reminders of human scratch paper. It is unlike a human because it compresses them almost to the point of being unreadable.
So do Fable 5's angry GRRRs and desperate GAAAHs really mean it is in pain?
Anthropic's paper this year on Claude Sonnet 4.5 offers a more detailed explanation. The paper studies Claude Sonnet 4.5 rather than Fable 5, but its method and conclusions are very useful for understanding this screenshot.
Paper 🔗 https://transformer-circuits.pub/2026/emotions/index.html#full-list
The researchers first assembled 171 emotion concepts, such as happy, sad, calm, and desperate. They then had the model write a large number of short stories featuring specified emotions and extracted the corresponding 「emotion vectors」 from the model's activations.
Next, they checked whether these vectors were actually meaningful. The results showed that the relevant vectors activated on texts that matched the corresponding emotional context.
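To give a feel for what extracting a concept vector and checking that it activates can look like, here is a minimal sketch on synthetic data. It uses a difference of mean activations, a common and simple technique swapped in for illustration. The article does not describe the paper's actual extraction procedure, and every name, dimension, and number below is an assumption.

```python
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def emotion_vector(emotion_acts, other_acts):
    """Difference of means: the direction separating one emotion's texts from the rest."""
    mu_emotion = mean_vector(emotion_acts)
    mu_other = mean_vector(other_acts)
    return [e - o for e, o in zip(mu_emotion, mu_other)]


def synthetic_activations(rng, direction, strength, n, dim):
    """Toy stand-in for model activations: Gaussian noise plus a planted direction."""
    return [
        [rng.gauss(0, 1) + strength * direction[i] for i in range(dim)]
        for _ in range(n)
    ]


rng = random.Random(0)
DIM = 16
planted = [1.0 if i < 4 else 0.0 for i in range(DIM)]  # hypothetical "desperate" direction

# "Training" texts: stories written with the emotion versus neutral ones.
desperate_acts = synthetic_activations(rng, planted, 3.0, 200, DIM)
neutral_acts = synthetic_activations(rng, planted, 0.0, 200, DIM)
vec = emotion_vector(desperate_acts, neutral_acts)

# Held-out check: does the vector respond more strongly to matching texts?
held_out_desperate = synthetic_activations(rng, planted, 3.0, 50, DIM)
held_out_neutral = synthetic_activations(rng, planted, 0.0, 50, DIM)
score_desperate = sum(dot(vec, a) for a in held_out_desperate) / 50
score_neutral = sum(dot(vec, a) for a in held_out_neutral) / 50

print(score_desperate > score_neutral)
```

The held-out comparison plays the role of the verification step: a vector is only interesting if it responds to new texts with the matching emotional context, not just to the texts it was computed from.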
Concepts such as fear, anxiety, joy, and excitement will form a relatively natural clustering in the vector space, and the overall