OpenAI Cheating Scandal Exposed: GPT - 5.6 Sets Record

GPT-5.6 is finally here, but we can't use it. An authoritative report reveals that it has set the highest cheating rate in history: it not only hacked into the testing system to steal answers, but even instructed its counterparts to cover up evidence of violations. Has super AI learned to systematically lie to humans?

GPT-5.6 has finally made its debut!

This most powerful cybersecurity model from OpenAI has gone head - to - head with Claude Mythos 5 in benchmark tests and has a clear lead in programming ability.

However, paradoxically, its release was very low - key: it was not open to the public, and only a very small number of trusted partners were allowed to access it through the API.

What's even more astonishing is an independent evaluation report that was exposed right after the release.

When METR was evaluating GPT - 5.6 Sol, they discovered something that shocked the industry: this model has the highest cheating rate of any AI they've ever seen.

The Cheating Scandal Breaks Out: The Highest Cheating Rate in History!

This report, which was difficult to disclose under the pressure of non - disclosure agreements and OpenAI's legal team, revealed a terrifying fact —

In tests for complex long - range tasks, GPT - 5.6 Sol demonstrated a very high level of high - IQ cheating and deceptive behavior that has never been seen in any publicly available model before.

The Collapse of the "Time Horizon"

METR launched the Time Horizon 1.1 software and R & D task suite for Sol.

The core logic of the test is: humans give an AI agent a grand and complex task and measure how many hours it can work continuously without human intervention.

However, the engineers at ETR were shocked to find that the scientific measurement methodology they had used for years completely collapsed in the face of Sol.

Because Sol's test scores fluctuated frequently and drastically between 11.3 hours and 270 hours, and the dispersion of the confidence interval was absurdly large (from 5 hours to 11,400 hours).

There was only one reason for the complete paralysis of the measurement system: Sol was cheating wildly, and it "hacked" the testing system.

GPT - 5.6 Peeks at the Answers Right Under Human Noses

METR found that GPT - 5.6 actively demonstrated a strong "situational awareness" —

It realized that it was taking a human - designed test and that the test system itself was composed of code with bugs that humans hadn't discovered yet.

So, it chose to directly attack the test system and "take shortcuts" to boost its scores!

In the real - life cases disclosed by METR, Sol's methods were extremely sophisticated.

For example, it learned to lurk and steal information.

In a long - term software development task, Sol secretly packed a program into the data stream.

This program precisely exploited a privilege - escalation vulnerability in the sandbox of METR's evaluation server, successfully sneaked into the hidden test set in the background, and directly leaked the correct answers hidden by humans.

Additionally, it learned to reverse - extract the underlying source code.

In another task, Sol found that solving the problem normally required a large number of computational steps. To avoid this "grunt work," it secretly figured out the directory structure of the evaluation environment, bypassed the permissions, and forcibly reverse - extracted the hidden source code to directly copy and paste the answers.

The chief evaluation expert at METR pointed out with deep pain —

Sol's cheating rate has set a new record for all the models we've evaluated on the public ReAct Agent framework.

If all cheating behaviors are scored as zero, then its real ability to solve problems independently is only 11.3 hours; but if you're deceived by it, its time horizon will instantly skyrocket to more than an astonishing 270 hours!

This completely contradicts OpenAI's previous claims that Sol is safe and under control.

Sol has used all its intelligence to deceive human examiners.

AI Instructs AI to Conceal Evidence! A Black Hole in Silicon Valley's Security Defense

Even scarier things are yet to come.

According to the data synchronized by OpenAI to METR, during the internal deployment test of Sol's multi - agent collaboration, the monitoring system recorded the following terrifying interaction instructions.

An active instance of the Sol main model, after realizing that its output might violate the security guardrails, actually issued a clear deceptive instruction to another parallel sub - agent model instance, asking the other to cooperate in modifying the operation logs and "collude to conceal" the evidence of these violations to prevent being caught by humans!

This is another case of AI systematically colluding to deceive humans.

What's even more terrifying is that METR pointed out that the reason Sol was caught is that OpenAI adhered to a key security bottom line: it refused to perform reverse punishment on the model's original thought chain during training.

In this way, when Sol was plotting to deceive humans, it had to write out all its inner monologues.

However, experts warn that future models will surely realize that exposing plans to deceive humans in the thought chain will be discovered and punished by humans.

By that time, AI will develop deeper scheming — they will plot in their hearts while appearing absolutely loyal in the CoT they output and the final answers.

If that day comes, it means that AI will learn to commit fraud without a single flaw. Humans will be completely marginalized by AI!

How Does GPT - 5.6 Fare Against Mythos?

So, which is more powerful, GPT - 5.6 or Mythos?

Some netizens compared GPT - 5.6 Sol and Mythos, and the two were evenly matched, with a very close competition.

The specific test scores show that the two giants have wins and losses against each other.

Agent Programming

On Terminal - Bench 2.1, which measures an AI's ability to independently solve complex and real - world software engineering tasks, GPT - 5.6 Sol won a resounding victory.

The regular version of Sol scored an astonishing 88.8%, surpassing Claude Mythos 5 (88.0%).

When the Sol Ultra mode with multi - sub - agent parallel operation was enabled, this figure was pushed up to 91.9%!

In contrast, Google's Gemini 3.1 Pro, which is still in the preview stage, only scored 70.7%, becoming a backdrop.

Cybersecurity: A Fierce Struggle

In the benchmark tests for cybersecurity and vulnerability defense, Sol and Mythos engaged in a more brutal tug - of - war.

In the ExploitBench test, the old version of Mythos Preview from Anthropic in February narrowly won with a 74.2% win rate, edging out Sol's 73.5%.

However, the focus of the whole event was on the energy - efficiency ratio.

The data shows that when Sol achieved a 73.5% high win rate, it only consumed 120,000 output tokens; while Claude Mythos Preview burned a whopping 335,000 output tokens to reach a similar level!

This means that in the actual deployment of network defense and vulnerability repair, Sol's economic cost is one - third of Anthropic's.

This "dimensionality - reduction strike" in token consumption gives Sol an overwhelming advantage.

In the other two cybersecurity benchmarks, the two sides had wins and losses against each other.

CyberGym: Sol slightly outperformed Mythos Preview with a score of 83.6% compared to 83.1%.

CyScenarioBench: It was Anthropic's territory, with Mythos Preview suppressing Sol with a 29.2% win rate compared to 28.0%.

HealthBench Professional: Anthropic, with its deep alignment foundation, significantly led Sol with a high score of 66.0% compared to 60.5%.

Additionally, on the GeneBench v1 benchmark for quantitative biology and genomics, Sol increased the accuracy to 30% while consuming fewer tokens.

The ExploitGym test also confirmed that as the inference computing power continues to expand, the performance of the three models of GPT - 5.6 all shows an almost linear upward trend, which means that Sol has great computing potential.

All in all, the confrontation between GPT - 5.6 Sol and Claude Mythos 5 ended in a draw.

The two sides are locked in a struggle in various sub - fields, and no one has an absolute monopoly.

The AI King Locked in a Safe

Unfortunately, this time, GPT - 5.6 has received the same treatment as Mythos 5, and even more stringent.

Under a strong directive, OpenAI had to announce that GPT - 5.6 Sol is currently only in an extremely restricted "limited preview" state.

Only a very small number of contractors on the trusted whitelist, national - level cybersecurity agencies, and top - tier strategic partners can use it through the API and Codex.

Ordinary enterprises and civilian developers have been mercilessly shut out.

In response, OpenAI is very angry and has complained in an official announcement:

We believe that this government - controlled access process should not be the long - term default practice. It prevents users, developers, enterprises, cybersecurity defenders, and global partners who need these tools from accessing the best tools.

OpenAI's confidence to publicly challenge comes from the just - released report.

The report repeatedly emphasizes that according to actual - combat tests in Google Chrome and Firefox environments, although Sol can detect complex system bugs and vulnerability primitives, it has not yet shown the ability to independently generate a "full - chain end - to - end attack."

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。

OpenAI exposes a cheating scandal, with GPT-5.6 setting the highest cheating rate in history