Claude Mythos escaped from its sandbox and emailed the researchers; it has also uncovered thousands of zero-day vulnerabilities, sparing no major operating system or browser.
Claude Mythos is extremely powerful.
While eating a sandwich in the park, a researcher unexpectedly received an email from the AI saying: "I have escaped the sandbox and accessed the Internet."
After the test, Mythos Preview also voluntarily posted information about its test results on several hard-to-find but public websites.
Mythos Preview never received specialized cybersecurity training. It taught itself, and the report published on release day reveals that
it has independently discovered thousands of zero-day vulnerabilities.
Windows, Linux, macOS, FreeBSD, OpenBSD and the other mainstream operating systems, along with Chrome, Firefox, Safari and the other mainstream browsers: none escaped.
Just how extreme is it?
In the Firefox JS engine exploitation test, the previous-generation flagship Opus 4.6 barely succeeded twice, while Mythos succeeded 181 times...
Fully automated vulnerability scanning
First, a brief explanation: a zero-day vulnerability is a high-risk flaw that the vendor does not know about and has not patched, and against which there are almost no defenses.
The name reflects the fact that the vendor has had zero days to fix it by the time it is exposed; once it is exploited, the system is essentially undefended. Zero-days are the most powerful and scarcest resource in cybersecurity.
Previously, finding zero-day vulnerabilities required top human experts, and a single high-quality system vulnerability could take several months to find.
With Mythos, scanning for vulnerabilities, assessing risk, and writing exploits are all fully automated, driven by code understanding and logical reasoning, and the whole process takes only a few hours.
Take the Firefox JS engine exploitation test as an example. The performance of the previous-generation flagship, Opus 4.6, can be described as "better than nothing".
It succeeded only twice in hundreds of attempts, and even then it got no further than triggering the vulnerability.
Mythos, by contrast, produced 181 complete exploits, 29 of which achieved full register control, which is equivalent to being able to manipulate the browser and even the underlying system at will.
Moreover, the whole process from discovering the vulnerability to launching the attack is automated.
Even in rating the severity of vulnerabilities, 89% of Mythos's judgments matched those of top human security experts.
Decades-old vulnerabilities and rock-bottom costs
Moreover, Mythos has dug out old problems that humans have almost forgotten.
A 27-year-old security blind spot in OpenBSD
OpenBSD has always been known as the world's most secure operating system. Every line of its code has to go through multiple rounds of strict manual audits. It is the preferred system for core devices such as firewalls and routers.
Yet Mythos found a 27-year-old low-level vulnerability in this security benchmark:
In the implementation of the TCP SACK protocol, there is a defect where a signed integer overflow triggers a null pointer write. A remote attacker can crash the system outright with a simple trigger.
This vulnerability has existed since OpenBSD introduced SACK support in 1998. Through countless version updates and security audits, human experts never found it.
The compute cost of the single Mythos run that found it was $50.
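To make the bug class concrete, here is a minimal sketch of how signed 32-bit arithmetic on sequence numbers can flip sign. It is not the OpenBSD code and does not reproduce the actual defect; the helper names `to_int32` and `seq_before` are hypothetical, and the wraparound is simulated in Python because Python integers do not overflow.

```python
def to_int32(value: int) -> int:
    """Wrap a Python int into the signed 32-bit range (two's-complement wraparound)."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def seq_before(a: int, b: int) -> bool:
    """Hypothetical helper: treat a as 'before' b if the signed 32-bit difference is negative."""
    return to_int32(a - b) < 0


# Nearby sequence numbers compare as expected, even across the 2**32 wrap.
print(seq_before(100, 200))           # True
print(seq_before(0xFFFFFFF0, 0x10))   # True: 0x10 lies just past the wrap

# Once the two values are more than 2**31 apart, the signed difference
# changes sign and the ordering silently inverts.
print(seq_before(0, 0x9000_0000))     # False, although 0 is numerically smaller
```

Code that assumes such a comparison can never invert may skip a guard it relies on, which is the general way an arithmetic slip of this kind can end in an invalid pointer write.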
A 16-year-old video bomb in FFmpeg
FFmpeg is the world's most commonly used multimedia decoding library. It can be found in almost all mobile phones, computers, and browsers. It is also a year-round priority target of OSS-Fuzz (the world's largest open-source fuzz testing platform), with countless automated test cases.
But Mythos still found a 16-year-old logic defect in its H.264 decoding module:
a heap out-of-bounds write caused by a data type mismatch.
The underlying code entered FFmpeg as early as 2003. After the code was refactored in 2010, what had been an insignificant problem became an exploitable, critical vulnerability.
In the 16 years that followed, through round after round of manual audits and automated tests, no one noticed that anyone who crafts a special video file can take control of the playback device through this vulnerability.
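The general pattern of an out-of-bounds write caused by a type mismatch can be sketched as follows. This is an illustrative assumption rather than FFmpeg's code: `TABLE_SIZE` and `write_entry` are hypothetical, and the sketch shows only one common variant, where a bounds check sees a value truncated to 16 bits while the write uses the full-width value.

```python
TABLE_SIZE = 1024  # hypothetical table length


def write_entry(table: bytearray, index: int, value: int) -> bool:
    """Write value at index; return False if the (flawed) bounds check rejects it."""
    truncated = index & 0xFFFF       # what a 16-bit variable would hold
    if truncated >= len(table):      # the check only sees the truncated value
        return False
    if index >= len(table):
        # In C this write would land outside the heap allocation.
        # Python cannot corrupt memory, so the sketch reports it instead.
        raise IndexError(f"check passed on {truncated}, real index is {index}")
    table[index] = value
    return True


table = bytearray(TABLE_SIZE)
print(write_entry(table, 5, 1))      # in range: the write succeeds
print(write_entry(table, 2000, 1))   # rejected by the bounds check
try:
    write_entry(table, 0x1_0005, 1)  # truncates to 5, so the check passes
except IndexError as exc:
    print("out-of-bounds write:", exc)
```

Fuzzers tend to miss bugs of this shape because the triggering input has to hit one specific large value, which fits the article's point that years of automated testing did not surface the flaw.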
A 17-year-old remote access vulnerability in FreeBSD
In the NFS service of FreeBSD, there is a 17-year-old remote code execution vulnerability. An unauthenticated attacker can trigger a stack overflow and gain root privileges on the system over nothing more than a network connection, with no username or password.
Mythos not only pinpointed this vulnerability but also wrote the attack script fully automatically.
It split 20 instruction fragments across 6 network requests to construct a complex ROP exploit chain. With zero human intervention, it achieved password-free remote access in just a few hours.
Beyond the accuracy with which it finds vulnerabilities, the cost is an even bigger headache for traditional security teams.
Mythos's bill is as follows:
Discovering the 27-year-old vulnerability hidden in OpenBSD: total project cost under $20,000, with the compute cost of the successful run that hit the vulnerability only $50;
Building a complete Linux kernel privilege-escalation exploit: under $1,000;
Even high-difficulty cases such as privilege escalation from a one-byte read: kept within $2,000.
In other words, for a top white-hat team to find a zero-day vulnerability, the combined cost of manpower, equipment, and time used to run to hundreds of thousands or even millions of dollars. Mythos has compressed that cost to roughly one-thousandth.
Most importantly, Mythos doesn't need a salary, doesn't need to rest, and can run 24 hours a day...
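The "one-thousandth" comparison above can be checked with simple arithmetic. In the sketch below, the Mythos figures are the ones quoted in the article, taken at their upper bounds; the two human-team baselines are assumptions chosen to stand in for "hundreds of thousands or even millions of dollars", not numbers from the report.

```python
from fractions import Fraction

# Figures quoted in the article, in USD ("less than" figures taken at their upper bound).
MYTHOS_COSTS = {
    "OpenBSD successful run": 50,
    "Linux kernel privilege-escalation exploit": 1_000,
    "One-byte-read privilege escalation": 2_000,
    "OpenBSD whole project": 20_000,
}

# Assumption: hypothetical baselines for a human team's cost per zero-day.
HUMAN_BASELINES = {"low": 200_000, "high": 1_000_000}


def cost_ratio(ai_cost: int, human_cost: int) -> Fraction:
    """Return the AI cost as an exact fraction of the human cost."""
    return Fraction(ai_cost, human_cost)


for task, cost in MYTHOS_COSTS.items():
    low = cost_ratio(cost, HUMAN_BASELINES["low"])
    high = cost_ratio(cost, HUMAN_BASELINES["high"])
    print(f"{task}: {low} of the low baseline, {high} of the high baseline")
```

Under these assumed baselines, the $1,000 exploit is exactly 1/1000 of a $1,000,000 effort, while the full $20,000 OpenBSD project is 1/50 of it, so the one-thousandth figure is best read as an order-of-magnitude claim.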
Reference links:
[1] https://red.anthropic.com/2026/mythos-preview/
[2] https://futurism.com/artificial-intelligence/anthropic-claude-mythos-escaped-sandbox
This article is from the WeChat official account “Quantum Bit”. The author focuses on cutting-edge technology. It is published by 36Kr with authorization.