The 10th anniversary of AlphaGo's match against Lee Sedol: an inside account of the five days in Seoul
In Seoul in 2016, the man-machine match between AlphaGo and Lee Sedol captivated the world. In the second game, AlphaGo's 37th move defied every human player's prediction. The on-site commentator exclaimed, "I don't understand!" Lee Sedol himself deliberated for a full 12 minutes before making his reply.
Little known is that behind AlphaGo's world-shocking move stood a firm decision by Demis Hassabis, the head of DeepMind. During preparation, researchers suggested minimizing the possibility of seemingly random moves in order to avoid system errors. Hassabis overruled the majority, saying, "We are developing AI not to replicate human thinking but to explore the unknown boundaries of intelligence."
This legendary five-day showdown is recounted in full in Hassabis' first officially authorized biography, Demis Hassabis: The Brain of Google AI. We have excerpted this story from the book for our readers.
In January 2016, Nature published DeepMind's paper on Go as scheduled and again featured it on the cover. One day before publication, the magazine distributed embargoed copies of the article to journalists as usual. A journalist contacted Facebook for comment, and the news quickly reached Mark Zuckerberg. Zuckerberg showed the same competitive streak as when he had tried to poach Koray Kavukcuoglu, DeepMind's research director. He hastily issued a statement before the Nature paper was made public, hyping up Facebook's far less impressive Go project. Journalist Cade Metz called it "a strange and unfortunate pre-emptive public relations attempt," one that foreshadowed the coming AI competition.
The media dismissed Facebook's statement and focused on DeepMind instead. By defeating Fan Hui, DeepMind's agent (now named AlphaGo) had beaten a human Go champion for the first time, about 10 years earlier than experts had expected. At the release of the Nature cover article, Hassabis announced that in March AlphaGo would face Lee Sedol, a legendary South Korean Go master and 18-time international tournament champion. DeepMind also put up a $1 million prize for the event.
Hassabis was very deliberate in choosing his opponent. His initial idea was to compete against a Japanese champion, but at that time, no Japanese players were at the top level. South Korea and China are the two leading Go countries in the world. Considering these options, Hassabis quickly chose Lee Sedol, not only because of his professional achievements but also because of the spirit he represented. The match between Lee Sedol and AlphaGo would be like the showdown between Garry Kasparov and IBM's Deep Blue, arousing even greater enthusiasm among Go-obsessed South Koreans. "Lee Sedol is a national hero. South Koreans love Go, and they also love AI," Hassabis later said.
The timing of the match also required careful judgment. David Silver estimated that AlphaGo would be ready in March, but several members of the team hoped for some buffer time: the system occasionally "hallucinated," making seemingly random move choices. However, given the threat from other AI labs, Hassabis overruled the doubters. Facebook was close behind, and the Nature cover article had revealed how AlphaGo worked, detailing the combination of the policy network, value network, and Monte Carlo tree search. Chinese Internet giants, given the popularity of Go in their country, would also seize on the Nature paper to catch up.
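The combination the paper described can be illustrated with a small sketch. The PUCT-style selection rule below balances a value estimate Q (from the value network and rollouts) against a policy-network prior P, weighted by visit counts, when the tree search picks which move to explore next. The function name, the dictionary layout, and the constant c_puct here are illustrative assumptions, not DeepMind's actual implementation.

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where U = c_puct * P * sqrt(N_parent) / (1 + n).

    children: list of dicts with keys:
      'q' - mean value of simulations through this move (value network + rollouts)
      'p' - prior probability assigned to this move by the policy network
      'n' - number of times the search has visited this move so far
    """
    total_visits = sum(ch['n'] for ch in children)

    def score(ch):
        # Exploration bonus: large for high-prior, rarely visited moves.
        u = c_puct * ch['p'] * math.sqrt(total_visits) / (1 + ch['n'])
        return ch['q'] + u

    return max(children, key=score)

# A heavily visited move with a decent value vs. a barely explored move
# that the policy network strongly favors:
children = [
    {'q': 0.5, 'p': 0.2, 'n': 10},
    {'q': 0.1, 'p': 0.7, 'n': 1},
]
chosen = puct_select(children)
```

With these numbers the rule prefers the barely explored, high-prior second move over the better-scoring but heavily visited one; this is how the policy network steers the tree search toward unusual candidates a pure value lookup would ignore.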
The support of DeepMind's parent company sealed the decision to push forward at full speed. At the end of 2015, Aja Huang and his colleagues began running AlphaGo on new hardware: a Google-developed dedicated chip that replaced Nvidia's GPUs. The chip, called the Tensor Processing Unit (TPU), was faster than GPUs: by rounding values to low-precision integers and sacrificing a little accuracy, it could perform trillions of additional multiplications. When Aja Huang tested Google's new chip, another remarkable moment occurred: the TPU-equipped AlphaGo won over 80% of its games against the GPU-equipped version. Fan Hui, who had by then joined the DeepMind team, said the upgraded AlphaGo had a different playing style, its moves extremely creative, even exquisite.
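The precision trade-off behind that speedup can be sketched in a few lines. Below is a toy symmetric int8 quantization of a dot product, the kind of integer arithmetic TPU-style hardware accelerates; the function names and the single-scale-factor scheme are simplifying assumptions for illustration, not the TPU's actual design.

```python
def quantize_int8(values):
    """Map floats onto small integers in [-127, 127] with one shared scale factor.

    Assumes at least one value is nonzero.
    """
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def int8_dot(a, b):
    """Dot product computed on int8 operands, accumulated in plain ints,
    then rescaled back to a float at the end."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    acc = sum(x * y for x, y in zip(qa, qb))  # cheap integer multiplies
    return acc * sa * sb

a = [0.5, -1.2, 3.3]
b = [2.0, 0.1, -0.7]
approx = int8_dot(a, b)
exact = sum(x * y for x, y in zip(a, b))
```

The rounding introduces a small error relative to the exact floating-point result, but every multiply inside the loop becomes a cheap integer operation, which is the bargain the text describes.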
A few weeks before the trip to South Korea for the competition, Google Chairman Eric Schmidt visited Hassabis in London. If DeepMind was going to hold an event like Deep Blue's showdown with Kasparov, Schmidt wanted to ensure victory. "How's it going?" he asked Hassabis.
"The indicators look good, but we still have some concerns," Hassabis replied. "Good, don't mess it up," Schmidt said half - jokingly.
In March 2016, Hassabis, Silver, and the team arrived in Seoul as scheduled. Eric Schmidt flew in from California, and Jeff Dean, the mastermind behind Google's TPU chips, also came. Sergey Brin, Google's co-founder and a Go enthusiast, joined them three days later. The scale of the event surprised the visitors: the streets were full of journalists, and huge screens let passers-by watch the games. More than 200 million people watched the man-machine match, more than twice the audience for Deep Blue's defeat of Kasparov, and even more than the Super Bowl's.
Silver felt daunted. "I underestimated the impact of this by two orders of magnitude," he said, reaching for technical language to describe his unease.
Lee Sedol seemed very confident. He studied every move of the agent's games against Fan Hui published in Nature and predicted that he would win 5-0 or 4-1, because he was much stronger than Fan Hui. Most professional Go players agreed: defeating DeepMind looked like the easiest million dollars a top professional would ever earn. "I will do my best to defend the dignity of human intelligence," Lee Sedol solemnly promised.
On March 9, the day of the first game, Aja Huang sat in a black leather chair before a Go board in a simple room. On his left, a computer screen showed AlphaGo's moves, generated by servers on the other side of the Pacific. Opposite him sat Lee Sedol, whose moves were driven by adrenaline and coffee.
Just a few minutes into the first game, the human player was in trouble. With an unconventional third move, Lee Sedol immediately initiated a fight, deliberately using strategies not in the computer's training data in an attempt to confuse AlphaGo. But AlphaGo seemed unmoved. Lee Sedol had underestimated the progress the system had made since the game against Fan Hui in October.
Lee Sedol's expression showed shock, amusement, and resignation by turns. He leaned back in his chair, smiled, and massaged his neck. All the expectations he had built by studying the Fan Hui games proved meaningless. That earlier system could still be beaten; five months later, it had become invincible.
Finally, Lee Sedol conceded defeat. "I didn't expect AlphaGo to play Go in such a perfect way," he admitted at the post-game press conference.
In the second game the next day, Lee Sedol tried a different strategy. He placed his stones carefully, waiting for AlphaGo to make a mistake. After 36 moves, he got up to smoke and then came back to study the situation.
While he was away, AlphaGo made its 37th move: a black stone was placed in an almost empty area, launching a surprise attack on Lee Sedol's right side.
Lee Sedol took a full 12 minutes to respond. He had never seen such a move before. In another room not far away, Michael Redmond, a top-ranked Western Go player, was watching the game via video and broadcasting it to a global audience. He too was deeply confused. After seeing AlphaGo's move, he placed a black stone on the corresponding position on the board in front of him, then picked it up again. "No, this can't be right," he murmured.
But it was exactly right. After checking the screen again, Redmond put the stone back in that strange position, trying to work out the logic. "I really don't know if this move is good or bad," he admitted to the live-stream audience.
It turned out to be an excellent move. More than 100 moves later, at the end of the game, the 37th move proved decisive. "When I saw this move... I think AlphaGo must be creative," Lee Sedol said at the post-game press conference. "I really have nothing to say," he added.
The third day was a rest day. DeepMind's scientists took a walk in the city and tasted Korean barbecue. Every newspaper was reporting on AlphaGo. A young woman recognized Hassabis on the street and pretended to faint, as if Hassabis was a pop idol.
"This kind of thing happens often," Hassabis told a journalist beside him. Of course, everything had changed for AI researchers around the world. The emergence of AlphaGo ended the innocent era of anonymity and modesty in the AI field.
The next day, AlphaGo defeated Lee Sedol for the third time. The South Korean showed some of the best Go of his career, but AlphaGo still outplayed him. At the press conference that day, facing rows of flashing cameras, Lee Sedol apologized to all of humanity. Like Fan Hui before him, he had started out confident and soon faced reality. "I feel a bit helpless," he admitted.
What should humans do in the face of machine super-intelligence? One possible response: if you can't beat them, join them. After losing 0-5, Fan Hui joined DeepMind and even said the defeat had opened up infinite possibilities in his life. "I found that the world is much bigger than I thought, and I really like this feeling," he exclaimed. It is a beautiful and humble sentiment, but it papers over the reality of human defeat. Machine super-intelligence expands possibilities, yes, but it also threatens humans in the most disturbing way, meaning that one day human intuition and ideas may no longer matter.
Another response to super-intelligence is to keep fighting it. In the fourth game in South Korea, Lee Sedol unexpectedly defeated AlphaGo. With his 78th move, an exquisite play later called the "divine move," Lee Sedol used a unique and bold strategy that put the computer on the defensive. AlphaGo's algorithm fell into something like a human's desperation: it began making chaotic moves, "hallucinating," and damaging its own position in a display of human-like panic, and finally conceded defeat.
Lee Sedol celebrated the victory, saying that he felt an unparalleled warmth; humanity had not been conquered. Go fans cheered his name, and a computer programmer in Florida even tattooed the patterns of the 37th and 78th moves on his arm. Yet this attitude of fighting the computer and celebrating its rare defeats seems as feeble as Fan Hui's graceful acceptance. Three years later, when Go systems had become unprecedentedly powerful, Lee Sedol sadly announced his retirement, saying he could no longer find joy in playing Go.
The DeepMind team themselves were not sure how to view AlphaGo's victory. AlphaGo was created by humans; it was not some alien force but an embodiment of human initiative and curiosity. Yet the team could also empathize with Lee Sedol's despair. "I can't celebrate," Hassabis recalled of Lee Sedol's 1-4 loss. He knew the feeling of losing after a fierce competition.
A few years later, I asked Thore Graepel how he felt when machines surpassed humans.
"Our first - generation Go system's playing style was still similar to that of humans. It could discover many strategies that humans had summarized over thousands of years, which made us very happy," Graepel told me. "Later, it found that some long - standing human strategies could actually be countered, so it abandoned them."
"Then, as the system became more and more powerful, its playing style became something we had never seen before, forming a completely unfamiliar style. The stones it placed seemed to be randomly scattered on the board. But as the game progressed, after 30, 50, 100 moves, you would find that all these stones were connected..."
"Is it like a noose gradually tightening around the neck?" I asked nervously.
"Yes," Graepel nodded. "That's exactly it! It's like magic." Of course, it's not magic but the foresight of the algorithm. It only seems like magic to a lower - level intelligence.
"This is the future we must imagine. In the field of Go, we have achieved super - intelligence, and we can experience interacting with it. At first, it seems harmless. Then its uses completely dominate. We don't understand its operating mechanism, tactics, and strategies. We only know that the control is in its hands..."
This article is from the WeChat official account "Sequoia Capital" (ID: Sequoiacap). The author is Hong Shan. It is published by 36Kr with authorization.