HomeArticle

Humans couldn't solve them in 56 years, but Google AI cracked 9 century-old problems overnight.

新智元2026-05-25 19:10
The spark of the mathematical singularity has been ignited.

  Reported by New Intelligence Yuan  

[Introduction] DeepMind has released a brand - new mathematical agent, AlphaProof Nexus, which has solved 9 open Erdős problems all at once. The oldest one has been unsolved for 56 years! All the proofs have been formally verified by the Lean compiler without hallucinations. Netizens exclaimed that the spark of the mathematical singularity has been ignited.

The mathematical community has gone completely crazy this month.

Just after OpenAI overturned the 80 - year - old Erdős conjecture, the exclamations of mathematicians haven't even died down.

Immediately afterwards, Google DeepMind released a brand - new AI mathematical agent - AlphaProof Nexus.

Paper link: https://arxiv.org/abs/2605.22763v1

As soon as it made its move, it solved 9 long - standing open Erdős problems that had remained unsolved for decades. The oldest one has been unsolved for a full 56 years!

Moreover, the computing power cost for each problem is only a few hundred dollars.

More importantly, this time the proof cannot be wrong.

Each step of the reasoning has been formally verified by the Lean compiler, leaving no room for hallucinations. If the compiler passes, the proof is correct.

It's worth mentioning that AlphaProof Nexus is completely different from the original AlphaProof, which won a silver medal at the IMO in 2024.

The original only had reinforcement learning tree search, while Nexus combines large language models, AlphaProof, and evolutionary algorithms, directly targeting research - level problems that human mathematicians can't solve.

AlphaProof Nexus: Evolutionary Algorithm + LLM + Lean Compiler

The architecture of this system is divided into four levels, from simple to complex.

1. Agent A (Basic Version)

Multiple independent proof sub - agents work in parallel. Each sub - agent has multiple rounds of conversations with Gemini 3.1 Pro, modifies Lean code through search - and - replace tools, and the compiler provides real - time feedback on error information. The sub - agents iterate and correct according to the feedback.

2. Agent B

On the basis of A, AlphaProof is added as a tool. When a sub - agent gets stuck on a sub - goal, it can call AlphaProof for reinforcement - learning - driven tree search to try to overcome local difficulties.

3. Agent C

An evolutionary algorithm is introduced. Multiple sub - agents no longer work independently but share a "population database". Each proof draft is scored by an LLM reviewer (using the Elo rating system), and high - scoring drafts are preferentially sampled, mutated, and evolved.

4. Agent D (Full Version)

The all - in - one version. The evolutionary algorithm + AlphaProof + Gemini 3.1 Pro work together. This is the main weapon that DeepMind uses to sweep through Erdős problems on a large scale.

The core cycle of the entire workflow is very clear:

The AI proposes a proof draft → The Lean compiler verifies it → If it fails, error information is fed back → The AI makes corrections → Verification is carried out again → The cycle repeats until the proof is completely passed or the computing power budget is exhausted.

Take Erdős #125 as an example. Its problem - solving process is as follows.

First, the sub - agent uses chain - of - thought reasoning to analyze the problem structure, then modifies the Lean code through search - and - replace, and then calls AlphaProof to handle sub - goals.

AlphaProof solved 3 out of 6 sub - goals. The sub - agent then decomposed the remaining "hard nuts" into smaller lemmas and called AlphaProof again - this time, all were solved.

Throughout the process, no human mathematicians intervened.

9 Erdős Problems, a 56 - Year - Old Unsolved Case Solved Overnight

DeepMind deployed the full - version Agent D on 353 formalized Erdős problems. Each problem was allowed a maximum of 3000 rounds of iteration.

Finally, 9 problems were solved.

Among them, the most valuable ones are:

1. Erdős #12 (Proposed in 1970)

Does there exist an infinite set A such that for any three distinct elements a < b, c, there is no a that divides b + c, and the density of A among the first N positive integers reaches the order of N^(1/2)?

This problem had remained unsolved for 56 years. During this period, many mathematicians made partial progress, but they were never able to give a complete construction.

The AI's solution ingeniously combines the Chinese Remainder Theorem and the three - term arithmetic progression avoidance set. By constructing a series of carefully designed "blocks", it simultaneously satisfies the density condition and the divisibility constraint.

2. Erdős #125 (Proposed in 1996)

For the set of integers A that only uses the digits 0 and 1 in base - 3 and the set of integers B that only uses the digits 0 and 1 in base - 4, is the lower density of their sum set A + B positive?

The AI proved that the answer is no - the lower density is zero.

The core of the proof is an inductive sparsification argument, which cleverly utilizes the Diophantine approximation properties of 3^m and 4^k (log4/log3 is an irrational number). By repeatedly finding two scales where the bases are almost aligned, the density gradually decays to zero at a rate of 0.99.

3. Erdős #138 (A Variant Proposed in 1981)

Does the difference between the van der Waerden numbers W(k + 1) - W(k) tend to infinity?

The AI gave an extremely elegant proof: W(k + 1) ≥ W(k)+k. The core idea is greedy coloring expansion - on the basis of a 2 - coloring without a monochromatic k - AP, new elements are added one by one, and the proof by contradiction shows that the greedy strategy will not fail.

4. Erdős #846

This is a problem about the collinearity property in a set of points on a plane.

The AI's construction is amazing.

It maps each edge of the complete graph K∞ to a point on the plane, encodes the coordinates with a quadratic polynomial, and then uses the infinite Ramsey theorem to complete the proof.

Currently, the Lean proof codes for all 9 problems have been open - sourced on GitHub.

Project link: https://github.com/google - deepmind/alphaproof - nexus - results

Can the Simple Agent Solve All 9 Problems?!

The most unexpected conclusion is not how powerful the full - version Agent D is, but rather -

The simplest Agent A can also solve all 9 problems.

Agent A has no evolutionary algorithm and no AlphaProof. It only has multiple independent LLM sub - agents and a feedback loop with the Lean compiler.

According to the comparative analysis of the DeepMind team, on most problems, the performance of Agent A and Agent B (the version with AlphaProof added) is almost the same within the margin of error.

In contrast, the advantage of Agent D is mainly reflected in the most difficult problems (such as #125 and #138), and it can complete the proof with a 2 - to 5 - fold cost advantage.

DeepMind attributes the success of the basic agent to two factors: the soaring ability of the LLM itself and the powerful role of compiler feedback in anchoring LLM reasoning.

That is to say, as the basic models become stronger and stronger, complex system engineering may gradually give way to simple agent loops.

The problems that today require the coordinated operation of evolutionary algorithms and AlphaProof to be solved efficiently may be solvable with a simple LLM + compiler loop tomorrow.

Specifically, for the cost, the median cost of the cheapest problem (#741(ii)) is only $5 - 7, and the most expensive one (#152) is no more than $200 - 400.

But the premise is to use the right model - if you run AlphaProof alone or use a smaller model (such as Gemini 3.0 Flash), none of the 9 problems can be solved.

Solving a 15 - Year - Old Algebraic Geometry Mystery and a New Bound in Convex Optimization

In addition to the Erdős problems, AlphaProof Nexus has also made substantial breakthroughs in multiple branches of mathematics:

OEIS Conjectures: The system automatically formalized 492 open conjectures and proved 44 of them. To prevent formalization errors, the system requires proving a "test lemma" first - verifying that the first few terms of the sequence are consistent with the formalized definition - before attempting the target conjecture.

Algebraic Geometry: It solved an open problem that had remained unsolved for about 15 years - proving the log - concavity of pure O - sequences of codimension 3 and type 2. This problem was previously considered the last major unsolved case in this field.

Convex Optimization: It solved an open problem about the exact convergence rate of the Anchored Gradient Descent - Ascent (Anchored GDA) algorithm. Even better, the AI not only verified a fixed algorithm but also independently searched for and discovered a new learning rate scheduling parameter during the proof process, thus achieving stronger guarantees.

Graph Theory: It proved a conjecture about the number of leaves in a spanning tree and local independent sets proposed by the Graffiti system in 1996, forming an interesting closed - loop - the AI proved a conjecture proposed by another AI.

Additive Combinatorics: It helped solve the 57th problem on Ben Green's famous list of open problems.

Quantum Optics: In cooperation with Mario Krenn, it solved multiple conjectures about monochromatic quantum graphs, corresponding to the construction of high - dimensional GHZ quantum states.