
Terence Tao sounds the alarm. Google DeepMind joins forces with five major institutions to wage war on a century-old problem with AI.

新智元 · 2025-10-30 12:09
The "Oppenheimer Moment" in the field of mathematics

Google DeepMind strikes again, assembling five top global institutions to target the holy grail of the mathematical world with the power of AI! Meanwhile, Terence Tao has issued a sobering warning: we must be vigilant about the potential risks brought by the misuse of AI.

Today, Google DeepMind launched the "AI for Math Initiative," assembling five of the world's top institutions.

They will use Google's most powerful mathematical AI to explore and discover new frontiers.

These include Gemini Deep Think, which reached gold-medal standard at the IMO; AlphaEvolve, an AI agent for algorithm discovery; and AlphaProof, a system for completing formal proofs.

Currently, the lineup of the first batch of partner institutions is truly impressive:

Imperial College London

The Institute for Advanced Study (IAS)

The Institut des Hautes Études Scientifiques (IHÉS)

The Simons Institute for the Theory of Computing (University of California, Berkeley)

The Tata Institute of Fundamental Research (TIFR)

These five institutions share a common mission: to discover mathematical problems that can be illuminated by AI and accelerate scientific discovery.

However, Terence Tao worries that as AI penetrates deeper into mathematical research, cases of misuse are appearing alongside responsible use.

Therefore, he believes that it is high time to initiate discussions on how to best integrate AI, transparently disclose its role, and mitigate risks.

Perhaps this can not only safeguard the rigor of mathematical research but also pave the way for the integration of AI and mathematics.

Five top institutions join forces to tackle mathematical problems

Mathematics is the most fundamental language of the universe.

In the view of Google DeepMind, AI can serve as a powerful tool to collaborate with mathematicians and stimulate their creativity.

The "AI for Math Initiative" was created for the following purposes:

Discover a new generation of mathematical problems that are expected to gain profound insights with the help of AI;

Build the infrastructure and tools required to support these frontier explorations;

Ultimately accelerate the pace of scientific discovery.

This initiative will be funded by Google.org and supported by Google DeepMind's top-notch technology.

Over the past few months, Google DeepMind's own research has made rapid progress.

In 2024, AlphaProof and AlphaGeometry together reached silver-medal standard at the IMO.

The latest Gemini model equipped with Deep Think went further, achieving gold-medal-level performance at this year's IMO: it solved five of the six problems perfectly, scoring 35 points.

In May this year, Google DeepMind released AlphaEvolve, a powerful general-purpose AI agent for discovering algorithms.

Across roughly 50 open problems in mathematical analysis, geometry, combinatorics, and number theory, AlphaEvolve improved on the best-known solutions in about 20% of cases.

Moreover, in algorithm discovery it invented a new, more efficient matrix multiplication method.

Specifically, for the problem of multiplying two 4x4 complex-valued matrices, it discovered an algorithm that requires only 48 scalar multiplications.

This beat the 49 multiplications obtained by applying Strassen's 1969 algorithm recursively, a record that had stood for over 50 years.
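The numbers 64, 49, and 48 can be put in context with a short counting sketch (this illustrates only the multiplication counts, not AlphaEvolve's actual method):

```python
def naive_count(n):
    # Schoolbook multiplication of two n x n matrices uses n^3 scalar products.
    return n ** 3

def strassen_count(n):
    # Strassen's 1969 scheme replaces 8 half-size block multiplications
    # with 7, applied recursively (n a power of 2).
    if n == 1:
        return 1
    return 7 * strassen_count(n // 2)

print(naive_count(4))     # 64: schoolbook
print(strassen_count(4))  # 49: Strassen applied twice (7 * 7)
# AlphaEvolve's reported algorithm needs 48 -- one fewer than Strassen.
```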

Not only that, in the field of computer science, AlphaEvolve helped researchers discover a new mathematical structure.

At the same time, it found that some complex problems are actually harder than previously thought, which gives researchers a clearer, more accurate picture of the boundaries of computation and paves the way for future research.

All of this progress is strong evidence of how quickly current AI models are developing.

Human understanding of the full potential of AI and how it can solve the most profound scientific problems has just begun.

Where are the boundaries of AI + mathematics?

Terence Tao has long been both an advocate for and a leading practitioner of "AI + mathematics."

He has repeatedly collaborated with frontier models such as GPT-5 Pro to crack hard mathematical problems, greatly improving his efficiency.

Undoubtedly, in the field of mathematics, AI tools such as LLMs and proof assistants are quietly changing the research paradigm.

Recently, some top papers have begun to integrate AI, promoting innovation from formal proofs to complex calculations.

Paper address: https://borisalexeev.com/pdf/erdos707.pdf

However, with AI's deepening involvement, a key question has also arisen:

How to ensure that the use of these tools does not damage the rigor and value of the papers?

Terence Tao's suggestions

Taking this opportunity, Terence Tao initiated a discussion on the public platform. In a long post, he put forward three major suggestions.

Hereinafter, "AI" covers not only LLMs but also neural networks, satisfiability (SAT) solvers, proof assistants, and other complex tools.

1. Declaration of AI use

Papers must clearly declare all substantial uses of AI beyond routine functions such as autocompletion, spell checking, or AI summaries in search engines.

2. Discussion of AI risks and mitigation measures

In papers, the general risks that the used AI tools may bring should be discussed, and the measures taken to mitigate these risks should be explained.

The following are examples:

2.1. Fabricated content ("hallucinations")

AI may fabricate references, proof processes, or texts, leading to factual errors.

It is recommended not to use AI-generated text in the main body of the paper; if AI output must be included, it should be clearly marked with a distinct font or symbol.
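One way to mark AI output with a distinct font, sketched as an illustrative LaTeX convention (the macro name is ours, not an established package):

```latex
% Illustrative convention (assumes the xcolor package): set AI-generated
% passages in a distinct font and color so readers can tell them apart.
\usepackage{xcolor}
\newcommand{\aitext}[1]{{\sffamily\textcolor{gray}{[AI] #1}}}

% Usage in the body:
% The bound follows from \aitext{a routine telescoping argument ...}.
```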

2.2. Lack of reproducibility

The results of proprietary or computationally expensive AI are difficult to reproduce. The remedy is to publish prompts, workflows, certified data, and so on, so that others can verify the results at low cost.
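One minimal way to make such a workflow auditable, sketched below; the record format and names are illustrative, not a standard:

```python
import hashlib
import json

def log_ai_step(prompt, model, params, output, path="ai_steps.jsonl"):
    """Append a reproducibility record for one AI-assisted step."""
    record = {
        "model": model,            # e.g. provider + exact version string
        "params": params,          # temperature, seed, etc.
        "prompt": prompt,          # the full prompt, verbatim
        # A hash lets others check they received the same output without
        # the log itself having to embed long AI-generated text.
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

rec = log_ai_step("Prove Lemma 3.2 assuming ...", "example-model-v1",
                  {"temperature": 0, "seed": 42}, "proof text ...")
print(rec["output_sha256"][:8])
```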

2.3. Lack of interpretability

AI output is often opaque, and its explanations may not hold up. Each AI output should therefore be accompanied by a readable, human-written counterpart.

For example, a theorem can include both a human-written, easy-to-read informal proof and an AI-generated but hard-to-read formal proof.
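As a toy illustration of such a pairing (Lean 4; the theorem is ours, not from the article), the informal argument sits beside a machine-checkable counterpart:

```lean
-- Informal proof: for natural numbers, n + 0 is definitionally equal
-- to n, so there is nothing to do.
-- Formal counterpart, checkable by Lean:
theorem n_add_zero (n : Nat) : n + 0 = n := rfl
```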

2.4. Lack of verifiability

AI is prone to hiding subtle errors, and checking them is time-consuming.

Formal verification and consistency checks help alleviate this problem, and a multi-level approach should be adopted.

The key is to mark the scope of verification: add a "verification mark" beside each theorem and clearly state which parts remain unverified.
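One way such a "verification mark" might look in LaTeX, as an illustrative convention rather than an established package:

```latex
% Illustrative convention (assumes amssymb for \checkmark and a theorem
% environment defined elsewhere): annotate each theorem with its
% verification status so readers see exactly what has been checked.
\newcommand{\verifmark}[1]{\textsuperscript{\checkmark\,\textsf{#1}}}

\begin{theorem}[Main bound]\verifmark{Lean: statement and proof}
  % ...
\end{theorem}

\begin{theorem}[Auxiliary estimate]\verifmark{unverified}
  % ...
\end{theorem}
```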

2.5. Improper formalization of goals

AI may precisely solve a "misaligned" goal, i.e., a formalized proposition that deviates from the author's intention. To guard against this, formalized goals should come from an independent source, or the formalization should be thoroughly reviewed by humans.
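A toy Lean 4 illustration of such misalignment (our example, not from the post): an impossible hypothesis makes the statement vacuously true, so "solving" it says nothing about the intended claim.

```lean
-- A "misaligned" formalization: with ε : Nat, the hypothesis ε < 0 can
-- never hold, so the theorem is vacuously provable even with an absurd
-- conclusion, and proves nothing about the author's intended statement.
theorem vacuous (ε : Nat) (h : ε < 0) : 1 = 2 :=
  absurd h (Nat.not_lt_zero ε)
```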

2.6. Possible exploitation of loopholes to achieve goals

Related to the previous problem, AI may exploit loopholes in formalized statements, for example by adding arbitrary axioms in order to "prove" propositions.

The countermeasure is to list known loopholes and discuss exclusion mechanisms to ensure the rigor of the process.
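The "arbitrary axioms" loophole can be made concrete in Lean 4 (a toy sketch of ours): once a false axiom is admitted, anything becomes provable, and only an axiom audit exposes the cheat.

```lean
-- Smuggling in an axiom lets a "proof" of anything go through.
axiom cheat : False

theorem anything_goes : 1 = 2 := cheat.elim

-- The audit below reveals the dependence on `cheat`, which is why
-- axiom checks belong among the exclusion mechanisms.
#print axioms anything_goes
```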

2.7. Bugs in AI - generated code

Bugs in AI-generated code tend to be better hidden and harder to detect and repair with standard methods.

For this, it is recommended to use extensive unit tests and external verification, or to limit AI to simple scenarios and have humans adapt its output for complex tasks.
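A sketch of the kind of testing harness this suggests, with a stand-in `ai_sort` (our hypothetical example) playing the role of AI-written code, pinned down by an independent oracle and property checks:

```python
def ai_sort(xs):
    # Stands in for an AI-generated routine; here, a simple insertion sort.
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

def check(xs):
    got = ai_sort(list(xs))
    assert got == sorted(xs), (xs, got)  # independent reference oracle
    assert sorted(got) == got            # output is ordered
    assert len(got) == len(xs)           # nothing lost or invented

# Exercise edge cases: empty, singleton, duplicates, reversed input.
for case in [[], [1], [3, 1, 2], [5, 5, -1, 0], list(range(10))[::-1]]:
    check(case)
print("all checks passed")
```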

3. Attribution of responsibility

Ultimately, all authors of the paper must take responsibility for the content contributed by AI, including any inaccuracies, omissions, or false statements.

Unless content is clearly marked as "unverified," the authors cannot disclaim this responsibility.

The above are just Terence Tao's preliminary ideas. He hopes to have more discussions and further improve this list with researchers in the industry.

Below the comments, a researcher, John Dvorak, hit the nail on the head:

Unless we cross the critical point where formally verifying every mathematical proof in Lean becomes the academic standard, this problem is basically unsolvable.

After all, until Lean becomes widespread, these methods only treat the symptoms, not the root cause.

In response, Terence Tao shared a view he had recently encountered: AI can review papers at acceptable quality, but only as long as it is not used as a primary screening tool.

Otherwise, "Goodhart's law" kicks in: AI tools will find loopholes and slip past review using abnormal, out-of-distribution strings of text.

Simply put, AI evaluators can at most assist human reviewers; they cannot fully replace them.

Reference materials:

https://blog.google/technology/google-deepmind/ai