Terence Tao revolutioniert das wissenschaftliche Forschungsparadigma mit KI.
[Introduction] Terence Tao asked ChatGPT to translate a complex mathematical paper into Lean code and completed a formal proof in collaboration with AI. AI can understand papers and write correct propositions, but often gets stuck at crucial points. After human - machine cooperation, a verified 1125 - line proof was finally generated. This kind of "vibe coding" - style cooperation also makes mathematicians rethink: AI may not be an independent problem - solver, but it is profoundly changing the way of mathematical research.
The whiteboard was useless in that night's mathematical derivation.
Terence Tao stared at the screen as Lean, like a harsh referee, spat out lines of red characters.
After repeated back - and - forths, the error messages suddenly went silent.
1125 lines of Lean code were finalized - a complex counterexample to Erdős' Problem 613 was checked line by line into the formal world.
The writer was ChatGPT, the ideas were coordinated by Terence Tao, and the decision was made by the machine.
For a well - known unsolved mathematical problem, Fields Medalist Terence Tao invited ChatGPT and the mathematical proof assistant Lean to jointly complete a tedious and rigorous task: formalize a complex counterexample proof.
This counterexample originated from Problem 613 proposed by Paul Erdős, a problem that has puzzled mathematicians for decades.
https://www.erdosproblems.com/forum/thread/613
As early as the beginning of this century, some mathematicians gave a counterexample proof, "falsifying" this problem (that is, finding a counterexample to prove that the original conjecture does not hold).
But no one had ever tried to completely translate this proof into a computer - verifiable form because it meant writing all the reasoning details into formal logical code, which was an extremely large workload.
Terence Tao decided to be the first to try: he let ChatGPT first act as his "translator" and "assistant" to transform the human pen - and - paper proof into rigorous Lean - language code.
ChatGPT Reads Papers, and the Translator of Mathematical Jargon Goes Online
Terence Tao first asked ChatGPT to read the proof construction in the paper.
The mathematical descriptions in papers are often full of symbols and jargon, but ChatGPT is like an indefatigable teaching assistant. It can explain what these constructions mean paragraph by paragraph and then try to express them in a more "mechanical" way.
For example, the paper constructs a special graph (satisfying certain vertex and edge counting conditions) as a counterexample. ChatGPT can extract the key conditions from the text description and even translate it into the definitions required by Lean.
It is like translating obscure ancient Chinese into vernacular to ensure that every step is clear.
Of course, ChatGPT doesn't really understand profound mathematical concepts. It mostly relies on pattern matching and probability generation.
But in this scenario, it does show amazing "reading comprehension" ability.
Terence Tao asked it to express the propositions in the paper in Lean language, and ChatGPT almost immediately gave the correct definitions and proposition statements.
Sometimes, it would even take the initiative to "shine", for example, proving a property of a lemma without being prompted.
Such moments surprised Terence Tao, as if the AI student suddenly had an epiphany.
However, the excitement didn't last long. ChatGPT soon got stuck at the last step of the proof.
It could read and restate most of the content, but got stuck at the places where real creative leaps were needed.
After all, it is not a real mathematician but just plays the role of a skilled translator and a junior problem - solving assistant.
Human - Machine Collaboration, and 1125 Lines of Code Appear Out of Nowhere
Next came the patient work: guiding ChatGPT step by step to write Lean code, which is the so - called "vibe coding" process.
The so - called "vibe coding" means that humans do not give overly detailed and strict instructions but let the AI build the code step by step based on intuition and the overall idea, just like an improvised ensemble.
In this process, Terence Tao was more like a band conductor, providing direction and rhythm, while ChatGPT "played" code snippets improvisationally.
Lean acted as a strict referee, checking each code segment immediately. If it was wrong, the error message was like a "wrong note", and adjustments were needed.
This human - machine collaboration experience was both amazing and ridiculous.
ChatGPT sometimes showed superb "skills": it could even guess the intermediate lemmas that the mathematician wanted to prove and directly give the corresponding Lean proof ideas!
It could rattle off many common definitions and basic lemmas very quickly.
This saved Terence Tao a lot of time spent on looking up the Lean library and grammar, which was like having a super stenographer familiar with the Lean language by his side.
However, when it came to more complex or delicate parts, the AI started to "go off - key": it often wrote a long string of Lean code to no avail, either with illogical reasoning or inconsistent with previous definitions. Lean would mercilessly report errors, and sometimes ChatGPT looked innocent and couldn't tell where the mistake was, requiring human patience to correct it.
The AI kept going in circles, either forgetting the premises or citing the wrong theorems, making simple problems complicated.
Terence Tao had to give hints again and again: "Hey, you should prove this basic property. Don't stray too far."
After such back - and - forths, this "small goal" was finally achieved.
After nearly a week of "grinding", ChatGPT and Terence Tao finally completed the formalization of the entire counterexample proof.
The Lean code was 1125 lines long, like a mini - masterpiece.
https://github.com/teorth/analysis/blob/main/analysis/Analysis/Misc/erdos_613.lean
Looking back at these codes, the author joked that they were completely a pile of "spaghetti code" - with a complicated structure, full of long detours generated by the AI and mid - way changed ideas.
Normally, programmers might be very headache - stricken when seeing such code; but in mathematical proofs, this is not a big problem.
Because Lean finally verified it, it means that every sentence and every reasoning step is logically correct.
Even if the code looks redundant, as long as it can be accepted by Lean, the proof holds in a rigorous sense.
As Terence Tao said, Lean is simply a grand stage for "vibe coding".
AI Makes Blunders, and Humans Clean Up. Which Takes More Time?
Some people may ask: Is it really time - saving to let the AI mess around and spit out thousands of lines of rambling code?
Terence Tao's answer is yes.
Although interacting with ChatGPT was sometimes frustrating, compared with writing this 1125 - line Lean proof from scratch by himself, the AI saved him at least more than half of the time and energy.
More interestingly, ChatGPT could also detect some small mistakes when Terence Tao made requests during the conversation, such as improper parameter values, and then automatically correct them before generating the code.
It is not only an obedient coder but also occasionally acts as a "quality inspector" to check for humans.
This experience made Terence Tao extremely excited. He now dares to confidently hand over the tedious calculations that he used to think were not worth trying to the AI, while he focuses on the more creative parts.
Of course, it doesn't mean that AI is omnipotent.
In fact, in the process of formally writing the Lean proof, a large amount of low - level and repetitive finishing work was still done by humans in the end.
The code snippets written by ChatGPT often needed Terence Tao to carefully check, fine - tune the format, and then paste them into Lean to run and see if they passed.
Once an error was reported, he had to go back and prompt ChatGPT to modify it.
Many times, the AI would get stuck in a narrow mindset and keep producing the same wrong code, requiring human patience to guide it out of the dead - end.
All this shows that AI currently plays the role of a "powerful assistant" rather than an independent mathematician.
As the daily briefing of Nature magazine pointed out, these tools can help mathematicians confirm some almost unfathomable proofs and come up with strategies for difficult problems, but there is still a long way to go before they can automatically produce complete new proofs.
Human wisdom is still indispensable. At least for now, AI can't provide the most wonderful creativity and insights.
{1, 2, 4, 8, 13} Overturns Erdős' Conjecture
Another sensational case happened to Erdős' Problem 707.
This problem is about the relationship between Sidon sets and perfect difference sets in combinatorial mathematics - it sounds esoteric, but simply put, Erdős conjectured that any special "Sidon set" could be extended to a certain "perfect difference set".
This conjecture has been unresolved for decades, with a $1000 reward.
Until recently, two mathematicians, Boris Alexeev and Dustin G. Mixon, found a surprising counterexample: the set {1, 2, 4, 8, 13} is a Sidon set that cannot be extended to a perfect difference set!
Five seemingly ordinary numbers ended a long - standing conjecture, exciting and surprising the mathematical community.
Discovering the counterexample is only half of the story.
The two researchers made a bold decision: let AI verify their discovery.
They heard that Terence Tao successfully used ChatGPT to write Lean proofs, so they followed suit and invited the latest large - scale model to assist, writing the counterexample proof from beginning to end in Lean code.
They not only formalized the new counterexample they found but also let the AI write a Lean proof for another counterexample given by a mathematician, Marshall Hall Jr., decades ago.
Actually, Hall's result was published as early as the 1940s but was long ignored by the academic community.
In his 1947 paper "Cyclic projective planes" (Duke Math. J. 14(4): 1079–1090), in the paragraph after Theorem 4.3, Marshall Hall Jr. gave a specific counterexample of a set that cannot be extended to any finite perfect difference set (a difference set with λ = 1, also known as a planar difference set).
The example he gave in the original text was:
“For example the set {−8, −6, 0, 1, 4} may not be so extended.”