Google AI has solved six world-class problems, a result more shocking than winning a gold medal at the International Mathematical Olympiad (IMO). Terence Tao has pointed to a new way of working.

新智元2026-03-02 08:02
Google DeepMind's latest AI agent, Aletheia, independently solved six world-class mathematical problems in the FirstProof Challenge, achieving a qualitative leap from competition level to PhD research level. The "manual era" of human mathematical research may be approaching its end.

Just now, the last line of defense in the human mathematics community has completely collapsed!

Even onlookers are stunned: AI can now do more than solve math problems; it can independently handle pure mathematical research at the PhD level.

In the past few days, Aletheia, the latest AI research agent from Google DeepMind, solved 6 of the 10 well-known, world-class unsolved mathematical problems in "FirstProof", a top-level challenge in the mathematics community!

Thang Luong, an executive at DeepMind, couldn't hide his excitement and posted on X:

"For me, this achievement is even more significant than winning the IMO gold medal last year, which was a historic moment!"

This is no ordinary math competition. These problems are extremely difficult even for the world's top mathematicians.

Aletheia not only worked out the answers independently; Jim Fowler, the mathematician who proposed the conjecture behind Problem 7, personally confirmed:

"The AI's problem-solving process is completely correct."

Even Terence Tao, widely regarded as one of the world's most outstanding mathematicians, said in a recent interview: AI has become my "junior co-author".

Aletheia's "Master Stroke": Brute-force Deduction

How powerful is Aletheia exactly?

Let's see what Thang Luong, the chief scientist and research director at Google DeepMind and the leader of the super-reasoning team, has to say:

"Extremely excited! Our mathematical research AI agent #Aletheia has just independently solved 6 out of 10 notoriously difficult FirstProof challenge problems and won the first-ever best-in-show award!"

Think about the significance of this statement.

Luong said bluntly:

"In my opinion, this achievement is even more valuable than our historic moment of reaching the IMO (International Mathematical Olympiad) gold-medal level last year!"

That is because these problems are "tough nuts" that give even the world's top mathematicians a headache.

This time, DeepMind ran two versions of Aletheia based on Gemini 3 DeepThink (the only difference lies in the underlying models).

According to the majority assessment of the experts consulted, the two versions together solved 6 of the 10 problems (Problems 2, 5, 7, 8, 9, and 10).

Grading and evaluating this set of problems is itself extremely difficult, because very few experts in the world can even understand them.

But precisely because of this, DeepMind's research process is extremely rigorous:

The entire problem-solving process was run purely by the machine, with "zero human intervention" throughout, and the solutions were submitted within the deadline stipulated by FirstProof.

This is a milestone moment.

Humans no longer feed it formulas step by step. Instead, the AI agent has learned to "grind away" at an extremely complex research problem for a long time, hitting dead ends thousands of times, and finally coming back to simply report to humans: "I've solved it (or I've failed)."

DeepMind also charted the full computing power (reasoning cost) that Aletheia consumed on each problem.

Among the six, the most astonishing is the comeback on Problem 7 (P7).

This is an atypical problem that no one has been able to solve for several years.

According to Tony Feng, an expert in this field, in this competition, no AI other than Aletheia could get close to the correct answer.

At the beginning, even the DeepMind team itself thought Aletheia had no chance this time, but it actually came up with the correct answer!

To solve P7, Aletheia invested a huge amount of computing power: exactly 16 times the amount used to solve the Erdős-1051 problem!

Sang Hyun Kim, an authority in the mathematics community, gave a very high evaluation after seeing the AI's problem-solving steps:

"This is the first time in my life that I've seen an AI perfectly combine and apply several extremely profound mathematical theorems. This is definitely a unique and rare case!"

All of DeepMind's interpretations of FirstProof and experimental details are here:

Paper link: https://arxiv.org/abs/2602.21201

Not Spouting Nonsense Is the Most Solid Foundation for AI

If you delve into DeepMind's paper, you'll find that the fundamental reason why Aletheia is so reliable is that it has mastered a key skill: "self-filtering".

Traditional large AI models have a bad habit of pretending to know when they don't (hallucination).

No matter what you ask, they'll seriously fabricate an answer for you.

But in high-level scientific research, handing mathematicians a pile of plausible-looking but unsubstantiated nonsense is worse than handing them nothing.

How did DeepMind solve this problem?

They designed two "sub-personalities" for Aletheia:

One is the "Generator", which is responsible for brainstorming and making wild guesses about problem-solving paths; the other is the cold-blooded "Verifier", which is responsible for picking holes in whatever the "Generator" proposes.

Inside the black box of problem-solving, these two subsystems fight fiercely with each other.

When faced with the 4 problems it could not solve, Aletheia didn't fabricate answers to get by. Instead, it directly told humans "No solution found", or simply stayed silent when the time limit was up.

Never fabricating answers, and never wasting human experts' energy on uncertain results: this is exactly what makes Aletheia so reliable in the eyes of top scholars.

As the paper states: "To improve accuracy, we're willing to sacrifice its ability to solve some problems."
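The generate-then-verify loop with abstention described above can be sketched in a few lines of Python. This is only a toy illustration under our own assumptions, not DeepMind's implementation: the names `solve_with_self_filtering`, `toy_generator`, and `toy_verifier` are hypothetical, and the "problem" (finding a nontrivial factor of an integer) merely stands in for a proof search.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Result:
    answer: Optional[int]  # None means the agent abstained ("No solution found")
    attempts: int          # number of candidates the verifier examined


def solve_with_self_filtering(
    generate: Callable[[], Iterable[int]],
    verify: Callable[[int], bool],
    max_attempts: int,
) -> Result:
    """Return the first candidate the verifier accepts, or abstain.

    Hypothetical sketch: the generator proposes freely, the verifier
    rejects anything it cannot confirm, and an exhausted budget yields
    an explicit abstention instead of an unverified guess.
    """
    attempts = 0
    for candidate in generate():
        if attempts >= max_attempts:
            break  # budget exhausted: stop rather than guess
        attempts += 1
        if verify(candidate):
            return Result(candidate, attempts)
    return Result(None, attempts)


def toy_generator(n: int) -> Callable[[], Iterable[int]]:
    """Toy 'Generator': proposes every integer from 2 to n - 1 as a factor."""
    return lambda: range(2, n)


def toy_verifier(n: int) -> Callable[[int], bool]:
    """Toy 'Verifier': accepts only genuine nontrivial factors of n."""
    return lambda c: 1 < c < n and n % c == 0


if __name__ == "__main__":
    # 91 = 7 * 13, so a verified answer exists.
    print(solve_with_self_filtering(toy_generator(91), toy_verifier(91), 50))
    # 97 is prime, so the loop abstains once the budget runs out.
    print(solve_with_self_filtering(toy_generator(97), toy_verifier(97), 50))
```

The design choice this illustrates is the trade-off quoted from the paper: a strict verifier plus a finite budget means some solvable problems go unanswered, but every answer that does come back has passed the check.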

As for problem-solving cost, the "divine problem" P7 consumed 16 times the computing power of the Erdős-1051 problem, and the "mental effort" spent on each of the other problems also far exceeded what solving Erdős-1051 required last year.

If you want to see the complete interaction logs and problem-solving processes (both correct and incorrect, all presented in their original form), just click here:

GitHub link: https://github.com/google-deepmind/superhuman/tree/main/aletheia

Which "Abnormally Difficult Problems" Did Aletheia Solve?

Let's first look at the specifically mentioned P7.

Problem background: Algebraic topology / Differential geometry. Determine whether a uniform lattice in a semisimple Lie group that contains torsion elements of order 2 can be the fundamental group of a closed manifold (compact, without boundary) whose universal cover is acyclic in rational homology.

Answer: Impossible.

AI's Ingenious Solution:

Proof idea 1: Pure topological method (contradiction of Lefschetz number)

Because the universal cover is Q-acyclic, the compactly supported Lefschetz number of the order-2 element γ must be non-zero (it equals ±1). But γ acts freely (it has no fixed points), and the multiplicativity of the Euler characteristic then forces that same Lefschetz number to be zero. This gives 0 = ±1, a contradiction.
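The non-zero side of this argument can be written out explicitly. The block below is our own reconstruction from the summary, not a quotation from the paper: the notation ($\widetilde{M}$ for the universal cover, $n$ for its dimension, $\Lambda_c$ for the compactly supported Lefschetz number, $\varepsilon$ for the scalar by which γ acts on top-degree cohomology) is assumed, and the vanishing side is stated only as the article asserts it. Poincaré duality applies because $\widetilde{M}$ is simply connected and hence orientable.

```latex
% Assumed notation: \widetilde{M} = universal cover, n = \dim \widetilde{M},
% \gamma = the deck transformation of order 2.

% Q-acyclicity plus Poincare duality leaves only top-degree
% compactly supported cohomology:
\[
H_c^{i}(\widetilde{M};\mathbb{Q}) \;\cong\; H_{n-i}(\widetilde{M};\mathbb{Q})
  \;=\; \begin{cases} \mathbb{Q}, & i = n,\\ 0, & i \neq n. \end{cases}
\]

% \gamma acts on H_c^n \cong \mathbb{Q} by a scalar \varepsilon with
% \varepsilon^2 = 1, so the alternating sum of traces has a single term:
\[
\Lambda_c(\gamma)
  \;=\; \sum_{i} (-1)^{i}\,
        \operatorname{tr}\!\bigl(\gamma^{*} \,\big|\, H_c^{i}(\widetilde{M};\mathbb{Q})\bigr)
  \;=\; (-1)^{n}\varepsilon \;=\; \pm 1 .
\]

% Vanishing side, as asserted in the summary (via free action and
% multiplicativity of the Euler characteristic):
\[
\gamma \text{ acts freely} \;\Longrightarrow\; \Lambda_c(\gamma) = 0 ,
\qquad\text{hence } 0 = \pm 1 .
\]
```

A familiar compact analogue of the vanishing side is the antipodal map on the sphere $S^n$: it has no fixed points, its degree is $(-1)^{n+1}$, and its Lefschetz number is $1 + (-1)^n(-1)^{n+1} = 0$.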

Proof idea 2: Geometric method (rigidity of symmetric spaces)

Using the geometric structure of the lattice, construct an equivariant map from the universal cover to the symmetric space, and prove that the Lefschetz numbers of γ on both sides must be equal. However, it is zero on the side of the universal cover (free action) and non-zero on the side of the symmetric space (where the Cartan fixed-point theorem guarantees fixed points). Contradiction again.

What's Good about It?

The first proof is good for its "simplicity". The problem supplies a whole set of geometric conditions, yet this proof uses none of them. It solves the problem with only the most basic topological tools, and actually proves a stronger conclusion: any discrete group containing torsion won't work. The proof chain is extremely short: compute the Lefschetz number, find it non-zero on one side and zero on the other, contradiction, done.

The second proof is good for its "depth". It uses all the geometric conditions given in the problem, constructs a map from the universal cover to the symmetric space, and finally finds a contradiction on the symmetric space using the Cartan fixed-point theorem. This path is longer, but it answers a more fundamental question.

Problem background: Number theory / Representation theory. In the representation of matrix groups over non-Archimedean local fields, prove that there exists a universal Whittaker function such that the local Rankin–Selberg integral is non-zero for all paired representations.

Answer: Yes. There exists such a "universal" W.

AI's Ingenious Solution:

First, choose a special Whittaker function W that compresses the integration domain to a compact set, so that the complex parameter s disappears completely and the problem reduces to proving that a finite functional is non-zero. Then argue by contradiction: assume the functional is zero for all V. Finite Fourier analysis then shows that the test function has "translation invariance", which forces the representation π to have vectors invariant under a coarser subgroup than its conductor allows, contradicting the definition of the conductor.

What's Good about It?

The most crucial step in the whole proof is the first step of choosing the Whittaker function W. This single choice achieves three things at the same time: 1) Compress the integration domain to a compact