
Embarrassing! Just as OpenAI rolls out GPT-5.5-Cyber to patch the planet, a fatal bug is exposed in Codex.

新智元 2026-06-23 10:47
Today, OpenAI unveiled its fully-powered GPT-5.5-Cyber, aiming to fix vulnerabilities in open-source code worldwide. No sooner had the announcement been made than an epic-level bug was exposed in Codex: it is estimated to write a staggering 640TB of log data a year, enough to wear out an SSD completely.

OpenAI has just released the "fully-powered" GPT-5.5-Cyber!

This is, to date, the most powerful cybersecurity model, tailored specifically for authorized advanced defense tasks.

In the authoritative CyberGym benchmark test, it scored an impressive 85.6%, soundly defeating Mythos 5.

Its core capabilities include: tracking vulnerable code, verifying security risks, generating patches, and providing evidence for manual review.

The Codex Security plugin also made its debut today.

It can not only fix existing vulnerabilities in the system but also automatically prevent new vulnerabilities from entering the production environment.

Who would have thought that, at almost the same time, Codex itself would be found to have an "epic-scale" bug.

Many developers reported that when Codex is performing streaming tasks and running for a long time, it writes data to the local SQLite log at an extremely high frequency.

It is estimated to write 640TB in a year, which is enough to render a consumer-grade SSD useless within a year.

On one hand, OpenAI presents a security myth of "patching the planet," while on the other hand, a fatal bug of "burning through the hard drive" is exposed.

A real-life Song of Ice and Fire is playing out right in front of us!

The Arrival of the "Fully-Powered" GPT-5.5-Cyber

Overpowering Mythos 5

It has to be said that OpenAI has really gone all out this time.

It has unveiled three core strategies of its cybersecurity plan, Daybreak, in one go, and the core narrative fits in a single sentence:

AI has changed the "physical laws" of cybersecurity.

The core of this release is the full version of GPT-5.5-Cyber.

This is OpenAI's most powerful "dedicated cybersecurity model" to date, a top-notch cybersecurity tool specifically prepared for "verified defenders."

On the CyberGym benchmark, it scored 85.6%, the highest score among single models.

In comparison, the regular GPT-5.5 scored 81.8%, and Claude Opus 4.7 stopped at 73.1%.

On ExploitGym, which assesses "whether vulnerabilities can be turned into real attack code," the Cyber version scored 39.5% vs. 25.95% for the regular version.

On SEC-bench Pro, which assesses long-chain vulnerability discovery, the Cyber version scored 69.8% vs. 63.1% for the regular version.

On all three benchmarks, the fully-powered Cyber version outperforms the regular GPT-5.5.

Integrating an "AI Security Engineer" into Codex

Unveiling the "Daybreak" Blade

If GPT-5.5-Cyber is the spear, then Codex Security is the shield handed to every developer.

OpenAI has updated the Codex Security plugin and integrated it directly into Codex's workflow.

It offers out-of-the-box vulnerability scanning, threat modeling, attack path tracking, and automatic patch generation in one go.

Its logic is simple and straightforward: place a security engineer beside every programmer.

Since the research preview was launched in March this year, Codex Security has scanned over 30 million commits, covering more than 30,000 code repositories.

Among the findings, more than 70,000 have been confirmed and fixed through manual review, and more than 500,000 have been automatically triaged and fixed.

This is the scale that "patching vulnerabilities" must reach today: it used to be a manpower-intensive job, but now it runs at machine speed.

Patching the Planet Becomes a KPI

OpenAI has also launched a plan that sounds really exciting: Patch the Planet.

Why is this important? Because the truth in the open-source world is a bit counter-intuitive and cruel.

In 94% of widely used open-source projects, over 90% of the code added within a year is contributed by fewer than 10 developers.

The code that supports half of the Internet often has only a few people burning the midnight oil behind it.

AI has made "finding vulnerabilities" faster and faster, but this has turned into a disaster for maintainers: thousands of reports flood in, and more than half of them are low-quality false alarms.

So the core of "Patch the Planet" is precisely professional manual work:

Researchers first deduplicate and verify, and then present clean patches to the maintainers, rather than dumping all the noise on them.

More than 30 open-source projects have committed to joining the first batch, including cURL, Go, Python, Sigstore, and pyca/cryptography.

In a five-day sprint, hundreds of issues surfaced and dozens of patches were merged across 19 projects.

In addition, OpenAI announced the launch of the Daybreak Cybersecurity Partner Program.

It delivers its most powerful model capabilities to thousands of organizations through the products of nearly 30 security giants such as Cisco, CrowdStrike, Palo Alto Networks, and Cloudflare.

At the government level, OpenAI has established "trusted cybersecurity access" cooperation with institutions in the US and the UK and with the EU's ENISA.

In short, OpenAI doesn't just want to create a model; it wants to be the underlying operating system for global cybersecurity.

This is a big play, and OpenAI has struck a very high-profile stance.

The name "Daybreak" itself implies that dawn has arrived, and defenders will complete repairs before attackers take action.

Everything seemed perfect until netizens checked their hard-drive monitors:

"Guys who use Codex intensively, pay attention. Your disk might be under a nuclear-scale attack."

Codex's Excessive Log-Writing

Burning Through an SSD in a Year

Here's what happened. Some GitHub developers found that:

When Codex is performing streaming and long-running automated tasks, it writes TRACE logs to a local SQLite log file at ~/.codex/logs_2.sqlite at a terrifying speed of about 5MB/s (the actual measured peak even reaches 16MB/s).

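What does that add up to? A sustained 5MB/s comes to roughly 158TB a year, and the 640TB-a-year figure corresponds to about 20MB/s, which is the average rate implied by the 21-day, 37TB measurement described below.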

The nominal write endurance (TBW) of an ordinary consumer-grade SSD is only about 600TB.

That means, in less than a year, the logs silently written by Codex in the background can completely exhaust the lifespan of the entire solid-state drive.
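These figures can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1MB = 10^6 bytes, 1TB = 10^12 bytes) and uninterrupted, round-the-clock writing; the function names are illustrative and not part of any Codex tooling. The 37TB-over-21-days input is the measurement one GitHub user reported.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365
MB = 10**6   # decimal megabyte, an assumption of this sketch
TB = 10**12  # decimal terabyte, as used in SSD TBW ratings


def tb_per_year(rate_mb_per_s: float) -> float:
    """Terabytes written in one year at a constant write rate."""
    return rate_mb_per_s * MB * SECONDS_PER_DAY * DAYS_PER_YEAR / TB


def implied_rate_mb_per_s(total_tb: float, days: float) -> float:
    """Average write rate implied by a measured total over a period."""
    return total_tb * TB / (days * SECONDS_PER_DAY) / MB


def days_to_exhaust(tbw_tb: float, rate_mb_per_s: float) -> float:
    """Days until a drive's rated TBW is used up at a constant write rate."""
    return tbw_tb * TB / (rate_mb_per_s * MB * SECONDS_PER_DAY)


if __name__ == "__main__":
    measured = implied_rate_mb_per_s(37, 21)  # 37TB in 21 days
    print(f"5 MB/s sustained:  {tb_per_year(5):.1f} TB/year")
    print(f"16 MB/s sustained: {tb_per_year(16):.1f} TB/year")
    print(f"37 TB in 21 days:  {measured:.1f} MB/s average")
    print(f"at that rate:      {tb_per_year(measured):.0f} TB/year")
    print(f"600 TB TBW lasts:  {days_to_exhaust(600, measured):.0f} days")
```

Under these assumptions, a constant 5MB/s gives about 158TB a year, while the 37TB-in-21-days measurement averages about 20MB/s, or roughly 640TB a year, which would use up a 600TB rating in under a year.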

The most terrifying part is that all this happens "silently."

A GitHub user, 1996fanrui, measured that after running their machine for 21 days, about 37TB of data was written to the main SSD.

Upon investigation, the culprit was found to be Codex's SQLite log.

But when you open the file manager, the file size seems normal, because Codex keeps "writing and then deleting, writing and then deleting." With tens of thousands of insertions and deletions per minute, the file may not look large, but the actual write volume hitting the flash memory far exceeds what the file size suggests.
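The effect is easy to reproduce in miniature. The following is a minimal sketch using Python's standard sqlite3 module, not Codex's actual logging code: the table name `logs`, the row size, and the cycle counts are all made up for illustration. It repeatedly inserts and deletes rows, then compares the total payload written with the final file size.

```python
import os
import sqlite3
import tempfile


def churn(db_path: str, cycles: int = 50, rows: int = 200,
          payload_size: int = 4096) -> tuple[int, int]:
    """Insert then delete rows repeatedly.

    Returns (payload_bytes_inserted, final_file_size_in_bytes).
    """
    payload = b"x" * payload_size
    conn = sqlite3.connect(db_path)
    # Hypothetical schema; the real Codex log schema is not shown in the article.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs (id INTEGER PRIMARY KEY, body BLOB)"
    )
    inserted = 0
    for _ in range(cycles):
        conn.executemany(
            "INSERT INTO logs (body) VALUES (?)", [(payload,)] * rows
        )
        conn.commit()
        inserted += rows * payload_size
        # Deleted pages go to SQLite's free list and are reused by later
        # inserts, so the file stops growing after the first cycle or so.
        conn.execute("DELETE FROM logs")
        conn.commit()
    conn.close()
    return inserted, os.path.getsize(db_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inserted, size = churn(os.path.join(tmp, "demo.sqlite"))
        print(f"payload inserted: {inserted / 1e6:.1f} MB")
        print(f"final file size:  {size / 1e6:.1f} MB")
```

The payload total counts only the row data handed to SQLite. Physical writes are higher still once journaling and the SSD's own write amplification are included, so the gap between what the file manager shows and what reaches the flash is, if anything, understated here.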

In fact, the problem was first reported as issue #17320 as early as April this year, and more complaints followed in #24275 and #22444.

Finally, on June 14th, #28224 really brought it to light.