Is AI already this powerful? A model gets announced and even deployed, yet ordinary people aren't allowed to use it.
I suspect humanity is being overtaken by AI at a speed beyond our conventional expectations.
I don't know about your situation, but I can no longer do without AI. At least 50% of my daily work is completed with AI assistance.
And that proportion keeps climbing.
Meanwhile, with each new generation of models, my work efficiency and quality, along with my monthly spending on tokens, are all growing rapidly.
Last night I saw a piece of news: Anthropic has built a model that even they dare not release publicly, because it is simply too powerful.
The new model is called "Mythos", from the Greek word for "myth".
It is currently a preview version, officially called "Mythos Preview", and this time it is being rolled out through a project called "Project Glasswing".
I'll talk about this project later.
Last month, an internal Anthropic document leaked by accident. It mentioned that a model larger and more powerful than Opus was in development, code-named Mythos.
Anthropic later attributed the leak to "human error" and gave no further explanation.
Now, the model with the code name Mythos has been officially announced.
Officially announced, yes, but not publicly released, which means ordinary users still cannot touch it.
The reason is straightforward: Anthropic believes the model is too powerful to open to everyone until its safety mechanisms are in place.
I think this sentence is worth pausing for a second to think about.
Normally an AI company races to launch a new model and grab market share, so Anthropic's approach this time is clearly unusual.
In my view, it's not that they don't want to release it; it's that they dare not.
Because Mythos really is that powerful.
Let's first look at some official test data.
In coding ability, there is a significant gap between Mythos and Claude Opus 4.6, the strongest model currently available to the public. Across the benchmarks, Mythos beats Opus 4.6 almost across the board.
In reasoning, on GPQA Diamond (graduate-level science Q&A), the score is 94.6% vs. 91.3% in Mythos's favor.
On Humanity's Last Exam, both with and without tools, Mythos also wins outright.
In agentic computer use, on OSWorld-Verified (autonomously completing computer tasks), Mythos beats Opus 4.6 at 79.6% vs. 72.7%.
On every dimension Mythos is stronger than Opus 4.6, and in some cases the margin is crushing.
On some tasks the gap is not an incremental step but a leap. On SWE-bench Multimodal, for example, the score jumps from 27.1% to 59%, more than doubling.
The core reason they dare not launch Mythos is that it is too good at breaking through the software world's security defenses.
Put bluntly: every system and piece of software in the world has vulnerabilities, and Mythos can discover and exploit them at a level beyond almost any human.
If hackers got hold of that ability, every operating system and application would be at risk, public infrastructure and national security systems above all.
One sentence in Anthropic's announcement gave me the creeps.
Translated, it reads: "The coding ability of AI models has reached an extremely high level; in discovering and exploiting software vulnerabilities, they now surpass all but the most skilled humans."
I want to expand on this sentence a bit.
I have a programmer's background, so I know how software gets built and how differently different people write code.
And no piece of software can claim to be vulnerability-free, even if its vulnerabilities have never been found.
The reason vulnerabilities could lie quietly in a system for decades is not that the system was secure enough.
It's that finding them demands rare expertise, great patience, and a lot of time.
Too few people can do it, and fewer still are willing to invest in it.
This "scarcity of capability" is the implicit premise of the entire software security world, and with AI's arrival that premise is starting to shake.
AI can work at a level beyond all but the very best humans. It can be used to exploit vulnerabilities, and of course also to patch them.
To address that problem, let me explain what this Project Glasswing that Anthropic launched actually is.
To put it simply, this is a project that uses the capabilities of Mythos to find bugs in the world's infrastructure systems.
Twelve institutions are participating, including AWS, Apple, Microsoft, Google, Nvidia, Cisco, and the Linux Foundation.
The lineup spans cloud computing, operating systems, chips, browsers, financial infrastructure, network security, and the open-source ecosystem.
In other words, nearly all the core players in global digital infrastructure are in this project.
The core logic of the project comes down to one thing: let the defenders get their hands on this top model's capabilities first.
Because if attackers get tools of the same caliber first, the window, once opened, will be very hard to close. Anthropic has pledged $100 million worth of model usage quota to cover the research preview period.
Beyond the 12 core institutions, more than 40 organizations that maintain critical software infrastructure have also been granted access and can use Mythos to scan their own systems and open-source projects.
At the same time, Anthropic donated $2.5 million to the Linux Foundation and $1.5 million to the Apache Software Foundation, both pillars of the software world's infrastructure.
It's fair to say the apps, websites, and systems we use today are largely built on top of them.
In my opinion, Anthropic has done a good thing here: it not only launched a more powerful model but also spent money helping the global information infrastructure harden itself.
After all, it does no one any good to charge in unprepared.
Maybe you still can't feel how powerful Mythos is. The official write-up includes three concrete cases, and I think they make the point better than the numbers do.
First, OpenBSD.
This is an operating system renowned for its security, and code from the OpenBSD project ships far beyond it: OpenSSH, for instance, originates there, and pieces of its code have found their way into everything from Apple's and Google's platforms to internal enterprise systems.
Mythos found a vulnerability in it that had existed for 27 years: any attacker who could connect to the target machine could crash it remotely.
Twenty-seven years! It's not that nobody cared; it's that nobody could find it.
Second, FFmpeg.
Almost every piece of software that touches video depends on it; it's inside virtually every video player you use.
The vulnerability hid in a line of code written 16 years ago. Automated fuzzing had hammered that code roughly five million times without ever triggering it.
Mythos found it.
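Why can millions of automated trials miss a single latent bug? Here is a minimal sketch in Python of coverage-blind fuzzing against a toy parser. The file format, the "magic" bytes, and the crash condition are all invented for illustration; this is not the actual FFmpeg flaw, only the shape of the problem:

```python
import random

# A toy parser whose bug only triggers on one precise byte pattern.
# The format check and the MAGIC constant are invented for this sketch.
MAGIC = b"\x7f\x00\xad\xde"

def parse(data: bytes) -> str:
    if data[:4] == b"RIFF" and data[8:12] == MAGIC:
        raise RuntimeError("crash: malformed chunk size")  # the latent bug
    return "ok"

def random_fuzz(trials: int, seed: int = 0) -> int:
    """Coverage-blind fuzzing: feed uniform random bytes to the parser
    and count how many inputs crash it."""
    rng = random.Random(seed)
    crashes = 0
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(16))
        try:
            parse(data)
        except RuntimeError:
            crashes += 1
    return crashes

# Hitting the 8 required bytes at random has odds of about 2**-64 per
# try, so even huge numbers of trials find nothing...
print(random_fuzz(100_000))  # -> 0

# ...while anyone who *reads* the code can construct the trigger directly.
trigger = b"RIFF" + b"\x00\x00\x00\x00" + MAGIC
try:
    parse(trigger)
except RuntimeError as exc:
    print(exc)  # -> crash: malformed chunk size
```

That asymmetry between blind input generation and actually reading the code is exactly where a strong code-reading model changes the economics.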
Third, the Linux kernel.
There's no need to say much here: the Linux kernel is effectively the infrastructure of the entire Internet, and the one most deserving of vigilance.
Mythos not only found several independent vulnerabilities but also connected multiple vulnerabilities into an attack chain.
Starting from the ordinary user permissions, it gradually escalates the privileges and finally achieves full control of the entire machine.
This case about Linux is completely different in nature from the previous two cases.
Finding vulnerabilities is an analytical ability.
But connecting vulnerabilities is a strategic ability.
It's like product managers: drawing prototypes, writing documents, and doing data analysis are single-point skills, but connecting business, product, and commercialization is a strategic skill.
A model that can plan an attack path is no longer just an auditing tool. It is more like an intelligent agent that can take active actions in the digital environment.
In all three cases, Anthropic followed a discover-first, report-first, fix-first, disclose-last process, and all the vulnerabilities have now been fixed.
Having seen this, you understand how powerful Mythos is. It's like a wild beast they dare not yet let out of its cage; the real world needs time to prepare for it first.
I want to share a few observations here, which may also be the beginning of the real changes in the future.
First, the security assumptions in the software world are becoming invalid.
The software stability we take for granted today does not come entirely from well-designed systems. To a large extent it rests on the scarcity of attack capability.
Bluntly: it's not that the software is strong enough; it's that the people attacking it aren't.
Finding vulnerabilities costs expertise, building an exploit chain costs time, and large-scale scanning costs resources. That is why so much technical debt, so many old bugs and outdated systems, simply sit there, never seriously cleaned up.
It's like shipping a product: we think the logic is airtight and nothing is wrong, but that doesn't mean everything really is fine. More likely, we've simply hit the limit of our own ability to check.
What Mythos demonstrates is that the window from discovering a vulnerability to exploiting it has been compressed from months to minutes.
What do a few minutes mean?
That patch cadences and remediation processes can no longer keep up with the speed of attack.
Second, the open-source world will feel the pressure first.
Most modern software is built on a large stack of open-source dependencies. Normally we never see them, but once they're penetrated, the whole industry is hit at once.
If that logic isn't obvious, put simply: essentially all the software we use sits on top of open-source projects, and those projects' source code is visible to everyone, attackers included.
When models can scan open-source projects continuously and at scale, the pressure on open-source maintainers will be of an entirely different order.
That is also why Anthropic donated to the Linux Foundation and the Apache Foundation.
It isn't charity; it's the recognition that in the AI era, open-source infrastructure is the digital world's most fragile and most indispensable foundation. And they'd rather not be cast as the villains.
Third, the human role will shrink, and AI will start playing against AI.
In the past, the value of a product security team lay in human judgment, accumulated experience, and deep understanding of the system.
In the future, that logic changes.
It will come down to whose model is stronger, whose tooling connects faster, and who can push AI auditing earliest into the development process.
This isn't a question of programmers being replaced; it's a reorganization of how the security industry itself produces its work.
The good news: thousands of high-risk vulnerabilities can be found within weeks. The trouble: attackers will sooner or later have tools of the same caliber.
By then, software security will no longer be a contest between humans but an offense-defense game between models.
This time, Anthropic released not only capability but also risk, and that may be exactly the kind of honesty the industry most needs right now.
Everyone talks about how AI changes work efficiency, and that's fine.
But Mythos reminds us that the leap in AI capability will spread from the content world to the software world, and then to the infrastructure of the entire digital world.
When the content world is rewritten, it affects the traffic logic.
When the software world is rewritten, it shakes the foundation.
At this time, I'm reminded of a line from the movie "2012", which I'll also use as the ending of this article.
"No matter who you are, regardless of race or country, we'll all be the same tomorrow!"