After burning 400 tokens in just two hours, I finally realized how powerful Claude Fable 5 is
After Claude Fable 5 was released on June 10th, Zhiwei followed numerous cases on Twitter. What impressed me most was not the UI design or physics simulation demos, nor the 50-million-line codebase migrated within a day (that is beyond what I can personally evaluate). Instead, I was drawn to a seemingly simple example.
Source: https://x.com/ProperPrompter/status/2064405487492452856
Prompt:
Use SVG to simulate pixel art and create a beautiful and detailed cute animal scene. The size of each "pixel" should be the same.
Although there have been many attempts to make large models draw with SVG, this was the first time an AI-generated picture felt so natural and harmonious to me. In the shapes of the animals, the atmosphere of the environment, and the combination of colors, it felt as if Claude Fable 5 were painting while looking with its own "eyes".
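To make the "same-size pixel" constraint in the prompt concrete, here is a minimal sketch of how a character grid can be turned into SVG rectangles of equal size. The sprite, the palette, and the function name `grid_to_svg` are hypothetical illustrations, not anything Claude Fable 5 actually produced.

```python
# Illustrative sketch: render a character grid as SVG "pixels" of equal size.
# SPRITE, PALETTE, and grid_to_svg are hypothetical, not from the original post.

PALETTE = {
    "o": "#f4a460",  # fur
    "k": "#222222",  # eyes and nose
    "w": "#ffffff",  # muzzle
}

SPRITE = [
    ".o...o.",
    ".ooooo.",
    "okoooko",
    "oowwwoo",
    ".owkwo.",
    "..ooo..",
]


def grid_to_svg(rows, palette, pixel=10):
    """Return an SVG string where every coloured cell is one pixel-by-pixel square."""
    height = len(rows)
    width = max(len(row) for row in rows)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width * pixel}" '
        f'height="{height * pixel}" shape-rendering="crispEdges">'
    ]
    for y, row in enumerate(rows):
        for x, ch in enumerate(row):
            if ch not in palette:
                continue  # "." and unknown characters stay transparent
            parts.append(
                f'<rect x="{x * pixel}" y="{y * pixel}" width="{pixel}" '
                f'height="{pixel}" fill="{palette[ch]}"/>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


svg = grid_to_svg(SPRITE, PALETTE)
```

Because every rectangle shares the same `pixel` size and sits on a multiple of it, the grid stays uniform no matter how detailed the scene becomes; the hard part, which the model has to solve, is choosing what goes in each cell.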
After seeing the following case, along with the model's feat of quickly completing "Pokémon: FireRed", my suspicion that Claude Fable 5 might possess some kind of 3D visual thinking or "spatial intelligence" grew even stronger.
Therefore, in this evaluation, Zhiwei intends to focus on testing Claude Fable 5's ability to build visual concepts with code. (The entire process uses Claude Code for testing.)
Although netizens seem more enthusiastic about having Claude Fable 5 build "Minecraft" directly, most of its components are ordinary terrain elements. To verify whether the model really has visual thinking, it must be asked to realize unusual concepts and properties, such as characters from well-known IPs, recombination and reinvention of existing designs, and on-the-spot understanding of original designs.
Before the official start, it's still necessary to use a relatively complex case to test Claude Fable 5's basic programming ability.
In a 3D engine case that has challenged models like Gemini 3 Pro, Claude Fable 5 gave the best answer so far. Besides fully meeting the requirements with no bugs, it was the only model that did not omit the template library on the left-hand side.
This is just an appetizer. After all, the web-version Claude Sonnet 4.6 (low effort) can also basically complete this case.
Next, it's time to test Claude Fable 5's visual understanding and building ability.
I asked Claude Fable 5 to directly use the 3D engine just written to build a 3D model of Doraemon, and the result was truly perfect.
Then I asked for Chopper, and the result had more pleasant surprises than flaws.
Next, I added Luffy, placed him behind the other two, and emphasized his Gear 3 form. Claude Fable 5 clearly understood the giant-arm form Luffy takes in this state.
Finally, to make the scene richer, I asked Claude Fable 5 to draw Luffy's pirate ship, the "Going Merry", and have the three of them stand on the deck.
The result was not ideal. Claude Fable 5 drew the huge pirate ship as a small boat fit only for a scenic lake. That said, the model deliberately reproduced the sheep-head figurehead on the bow and the pirate flag, which was quite detailed.
After completing the above tests, Claude Code had consumed 43% of the 5-hour quota and $7.29 worth of tokens. The price is really high. To enjoy it freely, a Pro-level subscription may not be enough.
The "collapse" of the "Going Merry" may be due to the small workspace, which left Claude Fable 5 little room to work with.
Next, we break free of the engine framework's limitations and start building more complex objects: let Claude Fable 5 use Three.js directly to build the Eldia Kingdom in the style of "Minecraft", that is, the castle from "Attack on Titan" whose framework is three circular city walls.
Prompt:
You will use Three.js to build a first-person voxel (Minecraft-like) sandbox prototype. You can freely organize the project and introduce dependencies and post-processing.
Goal: Implement an interactive voxel world centered on the "Three-layer Castle of the Eldia Kingdom" in "Attack on Titan".
Core scene: The three-layer castle of Eldia. The core of the world is a huge "Paradise Island-style royal capital castle" with a three-layer city wall structure: Maria, Rose, and Sina.
Core experience: The player spawns on top of the outermost city wall (Maria), can walk around the city wall, fall from the city wall to the ground, and climb from the ground to the city wall. The world is a procedurally generated voxel terrain with villages, castles, rivers, grasslands, and forests.
The gameplay basically retains the classic feel of "Minecraft": third-person view, WASD + mouse, left-click to destroy, right-click to place, with an inventory. The rest of the details are up to you. The first sight should shock you with the city walls and the sunset.
If the implementation is successful, it should meet the following requirements:
The player can see the obvious layered structure of the three-layer city walls after entering the world;
Can move between different layers (stairs/ropes/ground);
The first layer is complex, the second layer is regular, and the third layer is magnificent;
Can freely destroy/place blocks;
The castle structure is visually "readable" (you can see the three-layer power structure at a glance).
The prompt itself carries few requirements. The key is to emphasize the goal and the acceptance criteria rather than the process.
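To picture the structure the prompt asks for, the three walls can be thought of as concentric rings stamped onto a voxel grid. The sketch below is only an illustration: the radii, the wall thickness, and the function names are assumptions, since the post gives no numbers and does not show the code Claude Fable 5 wrote.

```python
import math

# Hypothetical radii (in blocks) for the three walls; the post gives no numbers.
WALLS = {"Maria": 48, "Rose": 32, "Sina": 16}
THICKNESS = 2  # wall thickness in blocks, also an assumption


def wall_at(x, z, walls=WALLS, thickness=THICKNESS):
    """Return the name of the wall whose ring covers column (x, z), or None."""
    dist = math.hypot(x, z)
    for name, radius in walls.items():
        if abs(dist - radius) <= thickness / 2:
            return name
    return None


def wall_columns(walls=WALLS, thickness=THICKNESS):
    """Map each wall name to the list of (x, z) ground columns it occupies."""
    reach = max(walls.values()) + thickness
    columns = {name: [] for name in walls}
    for x in range(-reach, reach + 1):
        for z in range(-reach, reach + 1):
            name = wall_at(x, z, walls, thickness)
            if name is not None:
                columns[name].append((x, z))
    return columns


columns = wall_columns()
```

A world generator would then stack blocks on each returned column up to the wall height; gates, stairs, and the "complex / regular / magnificent" distinction between layers are left out of this sketch.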
During execution, Claude Fable 5 repeatedly takes headless screenshots through the Chrome CLI to inspect and test the current result. It really does seem to be "drawing, observing, testing, and thinking at the same time".
However, taking headless screenshots through the Chrome CLI may run into problems such as Mac permission restrictions, which stalled progress. Following ChatGPT's suggestion, I switched from the original approach to Playwright (an open-source library for browser testing and web scraping automation) and successfully completed the project.
Let's take a look at the effect:
At first glance, it is stunning. In a single shot, the model rendered the three huge city walls under the sunset. The vertical stripes on the city walls match the original work closely. You can even tell from the green cloak that the soldier serving as the protagonist belongs to the Survey Corps.
Of course, the complexity of this result is still far from that of human-made "Minecraft" works. For example, the following picture was created by John Papadopoulos, the founder and editor-in-chief of DSOGaming.
Bear in mind that Claude Fable 5 has only completed the macro framework of the Eldia Kingdom. The villages and forests on the plain are too messy and random. There is not even a trace of the most important residential area, that is, the barbican (the city structure shown in the above picture). The spacing between the city walls is too narrow, and there is no "epic" atmosphere.
On the other hand, the finished product built by Claude Fable 5 is at least not identical to any related work I could find. So, for now, it seems more likely that the result comes from the model's own understanding than from copying its training data.
Next, it's time to increase the difficulty, mainly to fix the above-mentioned flaws.
First, adjust the macro size.
Prompt:
Please adjust the size so that the ratio of the human height to the city wall height is 1:50, and the ratio of the city wall height to the distance between adjacent city walls is 1:20. There should also be a semi-circular barbican wall slightly protruding outward at each city gate.
After analyzing the requirements, Claude Fable 5 concluded that such a large spacing would make it impossible to show the three city walls on the same screen, and that the world would need to be rewritten to use streaming generation. As it explained later, this was a matter of rendering efficiency. It then offered me three distinct options: keep the current spacing, implement the extremely large spacing, or settle on a compromise.
With the extremely large spacing from my original prompt, the three walls could not be displayed on the same screen, which would badly hurt the visual atmosphere, and walking from one wall to the next would take too long. So I finally chose the compromise option.
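A quick calculation shows why the ratios in the prompt lead to such a large world. The player height, the placement of the innermost wall, and the walking speed below are assumptions chosen for illustration; only the 1:50 and 1:20 ratios come from the prompt.

```python
# Worked arithmetic for the ratios in the prompt.
# PLAYER_HEIGHT and WALK_SPEED are assumptions, not values from the post.
PLAYER_HEIGHT = 2        # blocks, a typical voxel-game character height
WALL_TO_PLAYER = 50      # wall height : human height, from the prompt
SPACING_TO_WALL = 20     # wall spacing : wall height, from the prompt

wall_height = PLAYER_HEIGHT * WALL_TO_PLAYER    # 100 blocks
wall_spacing = wall_height * SPACING_TO_WALL    # 2000 blocks

# Assumption: the innermost wall also sits one spacing away from the centre,
# so the outermost wall has a radius of three spacings.
outer_radius = 3 * wall_spacing                 # 6000 blocks
outer_diameter = 2 * outer_radius               # 12000 blocks

WALK_SPEED = 4.3  # blocks per second, a hypothetical walking speed
walk_minutes = wall_spacing / WALK_SPEED / 60   # minutes to cross one gap
```

Under these assumptions a single gap between walls is 2,000 blocks wide and takes several minutes to cross on foot, while the whole capital spans about 12,000 blocks, which is why showing all three walls on one screen is impractical.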
The compromise option is also good. You can see the three city walls at a glance.
At first glance, you may think that Claude Fable 5 is being lazy: the wall surface looks smooth and unstructured, and there are inexplicable gaps in the city wall. But once you get closer, you can see the care behind it.
As you approach the city wall, the real structure of the wall surface gradually emerges, and the original gaps are filled in.
I asked, "Why does expanding the radius of the city wall require a large amount of work? What are the specific work contents and challenges?"
Claude Fable 5 explained: Generation is not expensive; "letting you see" is expensive.
This presumably means that if every detail in the scene were rendered at once, memory usage would be too high and the scene would not run smoothly. So detailed rendering is generally done only for nearby geometry, while distant geometry is rendered roughly. This is the so-called streaming generation or streaming rendering, a common optimization in many games, especially open-world games.
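The idea of rendering nearby geometry in detail and distant geometry roughly can be sketched as a distance-based choice of detail level per chunk. This is a generic illustration of the technique, not the code Claude Fable 5 wrote; the thresholds, level names, and function names are all hypothetical.

```python
import math

# Hypothetical distance thresholds (in blocks); real projects tune these per scene.
LOD_LEVELS = [
    (64, "full"),     # individual voxels with surface detail
    (256, "coarse"),  # merged blocks with a simplified surface
]
FALLBACK = "impostor"  # flat stand-in used for far-away geometry


def pick_lod(camera, chunk_center, levels=LOD_LEVELS, fallback=FALLBACK):
    """Choose a detail level for one chunk from its distance to the camera."""
    distance = math.dist(camera, chunk_center)
    for max_distance, name in levels:
        if distance <= max_distance:
            return name
    return fallback


def plan_frame(camera, chunk_centers):
    """Map every chunk centre to the detail level it should be drawn at."""
    return {center: pick_lod(camera, center) for center in chunk_centers}


plan = plan_frame((0, 0, 0), [(10, 0, 0), (100, 0, 0), (1000, 0, 0)])
```

As the player walks toward a wall, its chunks cross the thresholds and are rebuilt at a higher detail level, which matches the behaviour described above where the wall's real structure appears only up close.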
The key is that it also