HomeArticle

The mathematics community ignored a "30-year flaw," but GPT-5 spotted it at a glance. Terence Tao: The AI scientific research revolution has begun.

新智元2025-11-05 18:52
GPT-5 aids in mathematical proofs, receiving praise from Terence Tao. AI drives research.

[Introduction] A single spark can start a prairie fire! The dignity of a proof lies in its verifiability; this time, GPT-5 has embedded mathematical evidence in code.

ChatGPT redeems OpenAI's reputation!

After being criticized by Hassabis for being embarrassing, GPT-5 has truly inspired new mathematical conclusions.

Sebastien Bubeck, a scientist at OpenAI, publicly claimed that GPT-5 had solved ten Erdős problems.

However, it was pointed out that GPT didn't solve the Erdős problems but rather found the literature where these problems had already been solved.

After that, he deleted his tweet and stated that he didn't intend to mislead anyone.

Yann LeCun denounced it as "self-inflicted": OpenAI is being harmed by their own GPT enthusiasts.

After that, his post on LinkedIn was significantly more low - key:

Now, there's a reversal -

Sebastien Bubeck has been "wronged", AI is indeed accelerating scientific progress.

Reversal: ChatGPT redeems OpenAI's reputation

Yesterday, there was a reversal in this story -

Princeton University mathematics Ph.D. Boris Alexeev (left in the picture below) and Ohio State University associate professor Dustin G. Mixon (right in the picture below) discovered that Problem 707 of the Erdős problems, which offered a $1000 reward, had been solved 30 years before it was even proposed.

Paper link: https://borisalexeev.com/pdf/erdos707.pdf

This situation is quite absurd, like mathematicians "fighting a phantom enemy" -

The answer predates the question by 30 years, but until recently, the outside world generally thought the problem hadn't been solved!

Currently, Problem 707 of the Erdős problems has been marked as "Disproved".

Link: https://www.erdosproblems.com/go_to/707

This time, Sebastien Bubeck got his revenge and tweeted:

It seems that literature retrieval is not an easy task after all 😅.

The subtext is that it wasn't easy for GPT-5 to find the 10 existing solutions before.

But what follows is even more exciting.

ChatGPT assists in mathematical proof, praised by Terence Tao

The two mathematicians were also skeptical of the results, so they decided to use GPT5 to generate a formal proof in Lean. And surprisingly, they succeeded!

Note ⚠️: ChatGPT and Lean are listed as collaborators, but the content of the paper was still "written" by the authors.

However, humans have put in a lot of effort in this process. They need to continuously provide feedback to GPT5 to improve the formal argument.

On the "Erdős problems" website, there have been many successful cases recently. Researchers have used large language models to find solutions to Erdős problems in existing literature.

It's worth mentioning that Terence Tao had previously successfully demonstrated a proof of concept for using AI to find the "existing answers" to Erdős problems.

Terence Tao also noticed this new proof and thought it was an interesting example of computer - assisted proof.

During the research process, the two mathematicians were convinced that Lean could help verify the authenticity of existing papers, but they were not familiar with Lean at that time and found its interface unfriendly.

However, since ChatGPT can write Lean code, they decided to formalize the entire proof through "vibe coding".

This process took about a week and was quite painful, but they finally succeeded unexpectedly -

In the formal system, ChatGPT strictly proved the negation of the Erdős conjecture.

The final generated proof consists of more than 6000 lines of code, including 26 definitions, 169 lemmas, and 4 theorems (the final counter - example verification part). On an ordinary laptop, the code verification takes less than half a minute.

After several rounds of interaction, Boris and Dustin believe that if the interface of the large language model can be deeply integrated with Lean and appropriately fine - tuned for this interaction mode, many problems will be greatly alleviated.

Even a small amount of targeted optimization is enough to make the experience of this "human - machine collaborative proof" smoother and more natural.

Terence Tao highly approves of this AI - assisted proof. He said that this is one of the rare cases of responsibly using LLM output in research papers:

Importantly, no output generated by the LLM was directly put into the main text (except for quoting Lean code snippets generated by the LLM for illustrative purposes);

Instead, such output was only used in a fully verifiable context (in this case, to generate code that can be type - checked by Lean).

However, Terence Tao emphasized: "Lean formalization is only a supplement to human proof and cannot replace it."

In addition, he can almost foresee some exaggerated reports - "This time, the LLM really solved an Erdős problem!"

- But the truth is far more complex and subtle. To draw any conclusions, one needs to carefully sort out the whole story.

GPT - 5 drives research, early signs emerge

Paata Ivanisvili, a mathematics professor at the University of California, Irvine, also listed ChatGPT as a co - author of his paper.

The new paper is a collaboration between mathematics professor Paata Ivanisvili and Xinyuan Xie (Xie Xinyuan), an alumnus of the University of Science and Technology of China from the class of 2022, with ChatGPT as the first author.

This exploration originated from their request to GPT - 5 Pro to find counter - examples in the publicly available unsolved problems (below 👇):

  • Link: https://simons.berkeley.edu/sites/default/files/openprobsmerged.pdf
  • Title: Real Analysis in Computer Science: A collection of Open Problems

After several numerical experiments, it proposed a counter - example for the Non - Interactive Correlation Distillation (NICD) problem with erasures:

A Boolean function defined on 5 bits, when the erasure parameter p = 0.40, its E|f(z)| value is strictly greater than the corresponding value of the 5 - bit majority function.

They recorded this discovery and verified the entire calculation process.

This result echoes the classic counter - example in linear threshold functions about "Majority is Least Stable". Even if the AI only applies the known counter - example pattern to a new scenario and verifies it, its contribution is still worthy of recognition.

Link: https://arxiv.org/abs/1703.07657

This is the "single spark" of AI in theoretical computer science: previously, large language models (LLMs) were mostly used for literature retrieval or numerical assistance, but this time, it truly generated a specific, finite, and verifiable counter - example.

In addition, Ernest Ryu, a mathematics professor at UCLA, used GPT - 5 Pro to solve an open problem in the field of convex optimization.

Although about 80% of the proof attempts by the model were wrong, it proposed many novel ideas.

Specific contributions of GPT - 5 Pro:

  • It provided the final feasible proof idea and argument framework.
  • It significantly accelerated the exploration process by quickly eliminating ineffective routes.

This work took about 12 hours and was completed in 3 days. Afterward, Ernest Ryu recalled that this proof was actually very simple.

Key steps of the proof generated by ChatGPT: