Why did a move played 10 years ago change today's AI?
Most people talking about AI today are focused on what's new: larger models, longer contexts, more human-like responses.
But in a podcast episode released on March 11, 2026, Google DeepMind looked back at the match ten years earlier in which AlphaGo defeated Lee Sedol.
They called that moment "the inflection point of AI."
Why that match?
Because a single move in it changed many people's understanding of AI.
It made people realize for the first time that AI doesn't just mimic humans; it might find paths humans have never taken. And once that ability extends beyond the board, what it changes won't be limited to Go.
How did that move 10 years ago shape AI all the way to the present, even changing the direction of its development?
Section 1 | Why Go?
To understand why that move was important, we first need to go back to a question researchers faced at that time: Why was Go long considered one of the most difficult fields for artificial intelligence to conquer?
In this podcast episode, Thore Graepel, one of AlphaGo's core architects, recalled that Go was almost a "perfect challenge" in the eyes of AI researchers.
The reason isn't complicated: the game's rules are very simple, but once play begins, the situation quickly becomes extremely complex.
A seemingly ordinary move can set off a chain reaction dozens of moves later, and those effects are often hard to foresee.
When it comes to board games, many people think first of chess. As early as 1997, IBM's Deep Blue defeated world champion Garry Kasparov, and many expected machines to achieve a similar breakthrough in Go soon after.
But the result was completely different.
Because from a computational perspective, the complexity of Go far exceeds that of chess.
A chess game typically lasts about sixty to seventy moves, while a Go game often runs to two or three hundred, and each turn offers far more legal moves: roughly 35 candidate moves per position in chess versus around 250 in Go, on average.
This means the number of possible game states grows exponentially, quickly exceeding what traditional computational methods can exhaustively enumerate.
Pushmeet Kohli, head of science at DeepMind, explained the difference on the podcast: the difficulty of Go isn't just the number of possible moves; the key is that games are long and require reading many layers of variations ahead.
For a machine, that means finding a reasonable path through an unimaginably large space.
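To get a feel for the scale, a back-of-the-envelope calculation is enough. The figures below are commonly cited rough averages, not exact counts, and the model naively assumes every position offers the same number of moves:

```python
# Back-of-the-envelope game-tree sizes, assuming every position offered the
# same number of moves. Rough averages: chess ~35 legal moves over ~80 plies,
# Go ~250 legal moves over ~150 plies.
def tree_size(branching: int, plies: int) -> int:
    return branching ** plies

chess = tree_size(35, 80)
go = tree_size(250, 150)

def magnitude(n: int) -> int:
    """Order of magnitude: number of decimal digits minus one."""
    return len(str(n)) - 1

print(f"chess ~ 10^{magnitude(chess)}")  # about 10^123
print(f"go    ~ 10^{magnitude(go)}")     # about 10^359
```

Even this crude estimate shows the gap: the Go tree isn't a few times larger than chess's, it is larger by hundreds of orders of magnitude, which is why brute-force enumeration was never an option.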
Human players facing such complexity have their own way of coping. Instead of calculating every variation, they rely on experience and intuition to screen out "promising" directions first, then read more deeply along those lines.
The problem is that early artificial intelligence didn't have this ability.
Traditional AI methods rely on massive computation, trying move after move in search of better results, but in a problem as complex as Go they quickly hit a bottleneck. So for a long time Go was regarded as a grand challenge for AI: it tests not only raw computation but also something akin to human intuition.
When DeepMind started researching Go, they tried to combine these two ways of thinking.
On the one hand, use deep learning to identify the "promising" move directions in a position;
On the other hand, use tree search to work out the likely continuations.
In other words, the machine needs to grasp the general direction quickly and also analyze critical positions in depth.
This method made researchers see the possibility of a breakthrough for the first time.
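The division of labor can be sketched in miniature. Everything below is a toy stand-in invented for illustration (the move scores, the value function, and the `top_k` cutoff are all made up); the real system learned its policy from data and used far more sophisticated search. The point is only the structure: a policy prunes, then search reads ahead.

```python
# Toy sketch of policy-guided search: a stand-in "policy" scores candidate
# moves, and a depth-limited search reads ahead only along the most
# promising ones. All numbers here are invented for illustration.

MOVES = list(range(10))  # toy action space

def policy(state: int) -> dict[int, float]:
    """Stand-in for a learned policy: a prior probability per move."""
    scores = {m: (state + m) % 7 + 1 for m in MOVES}  # arbitrary toy scores
    total = sum(scores.values())
    return {m: s / total for m, s in scores.items()}

def value(state: int) -> float:
    """Stand-in for a learned position evaluation."""
    return state % 11

def guided_search(state: int, depth: int, top_k: int = 3) -> float:
    """Look ahead `depth` plies, expanding only the top_k moves by prior."""
    if depth == 0:
        return value(state)
    priors = policy(state)
    candidates = sorted(priors, key=priors.get, reverse=True)[:top_k]
    # Prune to "promising" moves first, then read deeper -- rather than
    # exhaustively expanding all ten moves at every level.
    return max(guided_search(state + m, depth - 1, top_k) for m in candidates)

print(guided_search(0, depth=3))
```

With a branching factor of 10 and depth 3, full search visits 1,000 leaves; pruning to the top 3 per level visits only 27, which is exactly the trade the policy network makes possible at Go scale.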
Section 2 | Move 37: The Machine Found a New Path
If you look only at the result, AlphaGo's 4:1 victory over Lee Sedol might read as just another technological advance.
But what really made people remember it was one move in the second game.
On its 37th move, AlphaGo played a "shoulder hit" on the fifth line of the board.
Michael Redmond, the professional player commentating on the game, briefly thought the game record contained an error.
He picked up the stone and then put it down again, because in traditional Go theory this was hardly a position a human player would seriously consider.
Later, when the DeepMind team recalled this moment, they mentioned a detail: according to AlphaGo's own model, trained on the historical game records of human players, the probability of a human playing a move like move 37 was about one in ten thousand.
As the game went on, choices that had seemed unreasonable gradually began to pay off. Dozens of moves later, it became clear that this was no accidental experiment but a strategy outside traditional thinking.
It shifted the balance of power on the board and changed both sides' understanding of the relationship between territory and influence.
Thore Graepel recalled in the podcast that a professional player sitting next to him at that time didn't understand the meaning of this move at all at first and even said that he would clearly tell his students not to make such a move in normal times.
But after the game ended, that player came back specifically to tell him it was the most unforgettable game he had ever seen, because the machine had used a brand-new way of playing.
This is the significance of Move 37.
This move wasn't learned directly from human game records; it emerged from the machine's own exploration. And it proved one thing: machines can go beyond existing experience and find new solutions.
So many researchers later regarded that moment as a turning point.
Section 3 | AlphaZero: No Need for Human Experience
The DeepMind team then began asking: what else could this ability do?
The answer came quickly.
Not long after AlphaGo defeated Lee Sedol, the DeepMind team made an attempt that sounds simple but was quite bold at the time: they stopped using human game records entirely.
The machine no longer learned from millions of professional games; it was given only two things:
The rules of Go and the conditions for winning and losing.
Then it played against itself continuously, gradually finding better ways to play through repeated trial.
This is how AlphaZero works.
At first the machine knew almost nothing; it simply kept playing and adjusting its strategy. But as the number of games grew, it gradually formed its own understanding of which moves were promising and which positions were favorable.
The DeepMind team found that early in training, the machine would slowly "rediscover" many classic patterns that had long existed in Go, retracing almost all the experience humans had accumulated over hundreds of years. Then, as it explored further, it began to abandon some of that experience,
because it had found more effective alternatives.
Graepel said in the podcast that this was exactly what excited researchers the most about AlphaZero: It can not only rediscover human knowledge but also find ways of playing that humans haven't thought of on this basis.
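The learn-from-rules-only loop can be illustrated at toy scale. The sketch below is emphatically not AlphaZero — there is no neural network and no tree search — just a tabular stand-in for the same setup, applied to a simple stone-taking game whose optimal strategy is known: given only the rules and the win condition, pure self-play rediscovers which positions are losing.

```python
import random

# Toy self-play learner in the spirit of AlphaZero's setup: it is given only
# the rules and the win condition, and improves by playing itself.
# Game: a pile of N stones; players alternate removing 1-3; whoever takes
# the last stone wins. (Known theory: multiples of 4 are losing positions.)

random.seed(0)
N = 12
value = {s: 0.0 for s in range(N + 1)}  # learned value of each pile size

def pick_move(pile: int, explore: float) -> int:
    moves = [m for m in (1, 2, 3) if m <= pile]
    if random.random() < explore:
        return random.choice(moves)  # occasionally try something new
    # Otherwise, leave the opponent the worst position we know of.
    return min(moves, key=lambda m: value[pile - m])

for _ in range(20000):
    pile, history = N, []
    while pile > 0:
        m = pick_move(pile, explore=0.2)
        history.append(pile)
        pile -= m
    # The player who took the last stone won; credit positions accordingly.
    for i, s in enumerate(reversed(history)):
        outcome = 1.0 if i % 2 == 0 else -1.0
        value[s] += 0.01 * (outcome - value[s])

# Multiples of 4 should have drifted negative (bad for the player to move).
print({s: round(value[s], 2) for s in range(1, N + 1)})
```

After enough games, the table converges on the theory a human would derive by hand — positive values for winnable pile sizes, negative for multiples of 4 — without ever seeing a human game, which is the pattern the article describes at vastly greater scale.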
As it happens, someone had foreseen this ability back during the Seoul match.
The filming crew shooting the AlphaGo documentary was packing up their equipment at that time, but the microphone was still on.
They accidentally recorded a conversation.
Demis Hassabis, the CEO of Google DeepMind, and David Silver, the chief research scientist, were chatting.
Demis said, "It's amazing to see this problem that was once considered impossible being solved so quickly."
Then he paused for a moment and continued, "I'm sure we can do protein folding now. I thought we could before, but now we definitely can."
Section 4 | From the Chessboard to the Laboratory
They did achieve it. The best-known example is AlphaFold.
In biology, how proteins fold into three-dimensional structures had long been an extremely difficult problem.
Scientists can read a protein's amino acid sequence, but inferring the final spatial shape it will fold into often took years of experiments.
By learning from large amounts of data together with physical constraints, AlphaFold produced predictions approaching experimental accuracy at the CASP competition in 2020.
Many researchers later commented that this work significantly accelerated the research speed in structural biology.
Similar things also happened in the fields of mathematics and computing.
Matrix multiplication is one of the most basic operations in computer science, yet for decades improvements to its known algorithms came only rarely.
DeepMind had a model search through vast numbers of possible computation steps and discovered new algorithmic routes, some of which use fewer operations than the best previously known human-designed methods.
This is what AlphaTensor does.
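This kind of discovery has a classic human-found precedent: Strassen's 1969 scheme multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8, and AlphaTensor searched for schemes of exactly this flavor. A quick check that the trick really works:

```python
# Strassen's 2x2 scheme: 7 scalar multiplications instead of the naive 8.
# This is the kind of multiplication-count saving AlphaTensor searches for.

def naive_2x2(A, B):
    """Standard definition: 8 scalar multiplications."""
    return [
        [A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
        [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]],
    ]

def strassen_2x2(A, B):
    """Strassen's scheme: only 7 scalar multiplications (m1..m7)."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(naive_2x2(A, B))     # [[19, 22], [43, 50]]
print(strassen_2x2(A, B))  # same result, one fewer multiplication
```

One saved multiplication per 2x2 block sounds trivial, but applied recursively to large matrices it lowers the asymptotic cost, which is why shaving even a single multiplication from such a scheme matters.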
Another example is AlphaEvolve.
The research team applied the same exploratory approach to engineering problems, such as optimizing data-center resource allocation and improving logistics routes.
In these settings, the machine searches a huge space of candidate solutions, and some of its results go beyond the engineers' original designs.
From protein folding to matrix multiplication to engineering optimization, the same method lies behind these breakthroughs:
Let the machine explore independently in an environment with clear rules, and it may find paths humans haven't noticed.
This method was first verified in Go.
That is why DeepMind researchers so often return to that game.
When people ask why artificial intelligence has suddenly made so much progress in recent years, they often point back to:
That move on the chessboard.
Conclusion | Looking Back at That Move 10 Years Later
Many people date the start of the AI wave to the emergence of large models.
DeepMind researchers prefer to go back to 2016.
That Go game proved one thing: Machines don't just learn; they can create.
After this ability was verified, the entire research direction changed.
That move 10 years ago didn't just change the outcome of a single game.
What it changed was that people began to believe: Given enough rules and exploration space, machines might find new paths that humans haven't thought of.
This logic has been repeatedly verified in the decade after Go.
And this is just the beginning.
📮 Original Article Link:
https://www.youtube.com/watch?v=qoinGjj60Fo&t=1432s
https://deepmind.google/research/alphago/?utm_source=chatgpt.com
This article is from the WeChat official account "AI Deep Researcher". Author: AI Deep Researcher. Republished by 36Kr with permission.