HomeArticle

GPT-5 Solves "Quantum NP Problem", First Paper Sets Academic World Ablaze, Compresses Two-Week Human Effort to 30 Minutes

新智元2025-09-29 11:17
Major milestone

GPT-5 is rewriting the rules of scientific discovery! A groundbreaking paper reveals that GPT-5 has solved the "quantum version of the NP problem" in just 30 minutes, while it would take humans 1 - 2 weeks. At this rate of development, AI is really not far from achieving a "Nobel Prize-level" breakthrough.

A few days ago, GPT-5 successfully passed the "Gödel test" and cracked the three major mathematical conjectures.

Unexpectedly, this time, GPT-5 has "conquered" a difficult problem in the quantum field.

Quantum computing expert Scott Aaronson published a paper for the first time, proving that GPT-5 helped crack one of the long-standing problems.

In the paper, Scott has been grappling with a core issue in quantum computing - the QMA complexity class, which can be regarded as the "quantum version of the NP problem".

The key lies in whether the error probability in the proof process can be infinitely reduced, especially whether "perfect completeness" can be achieved.

Paper link: https://arxiv.org/pdf/2509.21131

Previous academic research had already reduced the error to a very low level, but the latest research found that the "double-exponential error" is the theoretical limit of the existing methods and cannot be further improved.

After getting stuck in the key derivation step, the author turned to GPT-5 for help. At first, the AI gave a wrong idea.

But after about 30 minutes of interaction, it finally proposed a sophisticated mathematical function that accurately analyzed the behavior of eigenvalues.

The research proves that this idea became the most crucial breakthrough in the paper.

In his latest blog post, Scott exclaimed, "If a student came up with this idea, I'd definitely say - it's amazing!"

It is estimated that this problem would take 1 - 2 weeks of human effort to solve.

OpenAI scientist Sebastien and product manager Kevin excitedly reposted the news again, saying, "A major transformation has begun."

The Quantum Version of the NP Problem: The QMA Singularity

This paper, submitted to arXiv on the 25th, mainly studies the "limitations of black-box amplification in the quantum complexity class QMA".

So, what is QMA?

QMA, or Quantum Merlin - Arthur, can be regarded as the typical quantum version of NP.

It includes a class of decision problems:

If the answer is "yes", Merlin can send Arthur a quantum witness state, which allows Arthur (after polynomial-time quantum computation) to accept with a probability of at least 2/3;

If the answer is "no", no matter what witness state Merlin sends, Arthur's acceptance probability is at most 1/3.

Here, as is common in complexity theory, the constants 2/3 and 1/3 are just conventions and can be replaced by, for example, 1 - 2⁻ⁿ and 2⁻ⁿ through amplification.

In this field, a long - standing question is -

Does QMA equal QMA₁, where QMA₁ is a subclass of QMA that allows protocols to have "perfect completeness"?

In 2008, Scott Aaronson proved through practical analysis methods that there exists a "quantum oracle" such that QMA ≠ QMA₁.

This means that any attempt to prove QMA = QMA₁ requires "quantum non - relativization techniques".

This doesn't mean this obstacle is insurmountable, but it at least shows the complexity of the problem.

Breakthrough: The Limit of Double - Exponential Amplification

Until June this year, Freek Witteveen and Stacey Jeffery published a groundbreaking paper, proving that the QMA protocol can be amplified in a black - box manner, reducing the completeness error to "double - exponentially small", i.e., 1/exp(exp(n)).

Paper link: https://arxiv.org/pdf/2506.15551

They adopted a method that Scott had never thought of: encoding the acceptance probability into the amplitudes of a quantum state, and these amplitudes decrease geometrically.

Facts have proved that this "old friend" QMA, known for 25 years, can still bring surprises.

At an online meeting in August, Scott asked:

Is this double - exponential completeness the limit of black - box technology? Can it be further amplified to triple - exponentially small, i.e., 1/exp(exp(exp(n)))?

Conquered in 30 Minutes, GPT - 5 Shines

A week later, Scott and Freek wrote a complete proof, showing that under black - box technology, the double - exponentially small completeness error is the limit.

In other words, they quantified the oracle separation result of "QMA ≠ QMA₁" in 2008, and the resulting "lower bound" exactly matches the protocol in the June paper.

Perhaps the most striking part of this research is not the quantum complexity itself, but the role of AI in it.

As mentioned before, this is Scott Aaronson's first paper where a key technical step in the main result's proof comes from AI.

Specifically, it's GPT5 - Thinking.

At that time, the author faced a problem: analyzing an N×N Hermitian matrix E(θ) (e.g., N = 2ⁿ), where each element is a poly(n) - degree trigonometric polynomial of the real parameter θ.

What needs to be proved is the maximum eigenvalue of E(θ) when θ changes from 0 to 1, to prove that λₘₐₓ(E(θ)) cannot start from a value close to 0 and then "stay" close to 1 for a long time, for example, close to 1/exp(exp(exp(n))).

Given 1 - 2 weeks, Scott and his co - author could have solved this problem by consulting the literature.

But he chose GPT5 - Thinking. Five minutes later, it gave a confident but obviously wrong answer.

Scott didn't laugh at the AI but told it where it was wrong. After thinking for a while, GPT5 - Thinking tried again and gave a better solution.

After several iterations, just like communicating with a graduate student/colleague, GPT - 5 gave the following function:

It correctly pointed out that this is a rational function of θ with a controllable degree, and it exactly encodes information related to how close the maximum eigenvalue λₘₐₓ(E(θ)) is to 1.

To everyone's delight, this method worked, and the verification could be easily completed without AI assistance.

Scott thinks that perhaps GPT5 has seen a similar structure somewhere in its training data, but if a student proposed this solution, he would unhesitatingly call it "ingenious".

Finally, he recalled that a year ago, he tried a similar problem with the then - available GPT reasoning model, and the result was far from satisfactory.

Now, in September 2025, I can clearly tell you -

AI has begun to truly touch those core tasks that I think are most characteristic of human intelligence: proving oracle separations between quantum complexity classes.

Although it can't write an entire research paper independently yet, if you know what you're doing, it can help you out of trouble, which can be said to be an excellent application scenario.

Who knows how long this situation will last?

Scott Aaronson joked, "Thinking about this, I can't help but be glad that I have a secure job - a tenured position."

References:

https://scottaaronson.blog/?p=9183

https://x.com/SebastienBubeck/status/1972368891239375078

https://x.com/kimmonismus/status/1972399015825203463

This article is from the WeChat official account "New Intelligence Yuan". Author: New Intelligence Yuan. Editor: Taozi. Republished by 36Kr with permission.