
Full speeches of Hinton and Yan Junjie at WAIC: A warning and an embrace

Wang Fangyu · 2025-07-28 00:52
One talks about the future, and the other talks about implementation.

Text by | Wang Fangyu

Edited by | Su Jianxun

On July 26th, at the opening main forum of the World Artificial Intelligence Conference (WAIC), many top experts in the AI industry attended and delivered speeches, offering attendees a feast of ideas.

Geoffrey Hinton, the "Godfather of Deep Learning" and winner of the Turing Award and Nobel Prize, was the most anticipated speaker. He made a live appearance and gave a speech titled "Will Digital Intelligence Replace Biological Intelligence?" This was also his first public speech in China.

On the eve of the conference, Hinton and 20 top experts in the global artificial intelligence field had just signed the "Shanghai Consensus" on AI safety in Shanghai. His speech at the conference likewise centered on AI safety.

Hinton first reviewed the development from early models to modern large language models and argued that large language models have achieved a deep form of language understanding similar to the way humans understand language.

However, the difference is that AI systems have "immortality," and the replication of knowledge between machines can be carried out on a large scale, achieving exponential knowledge transfer. Therefore, the capabilities of AI are growing rapidly.

He thus raised the question: What would happen if AI becomes more intelligent than humans in the future? "If AI is smart enough, it will avoid being shut down by manipulating humans and gaining control."

Therefore, Hinton warned of the possibility that artificial intelligence may surpass human intelligence and the risks it brings. "In the long run, this is one of the most important issues facing humanity."

Hinton cautioned that AI may develop a higher level of intelligence than humans, ending humanity's status as the most intelligent species. AI agents may pursue survival and control, which could lead them to manipulate humans the way adults manipulate three-year-olds. Humans must therefore find ways to train AI so that it poses no threat to us.

In contrast to Hinton's theme, Yan Junjie, founder and CEO of MiniMax and an AI entrepreneur, focused his speech, titled "Artificial Intelligence for Everyone," on the practice and deployment of large AI models.

Yan Junjie gave examples of the efficient application of AI in data analysis, information tracking, creative design, and video production, and argued that artificial intelligence is not only a powerful form of productivity but also a continuous amplifier of individual and social capabilities. Moreover, large AI models will keep getting cheaper and more capable.

He argued that large AI models will not be monopolized by one or a few organizations. AGI will eventually be realized, and it will be something that serves and benefits the general public.

"If AGI is realized one day, I think the realization process must involve both AI companies and their users. And the ownership of AI models or AGI should belong to AI companies and their wide - range users, rather than just a single organization or company."

The following transcripts of the speeches have been edited by Intelligence Emergence:

Geoffrey Hinton, Nobel Prize and Turing Award winner, Emeritus Professor of Computer Science at the University of Toronto: Will Digital Intelligence Replace Biological Intelligence?

Over the past 60 years or so, AI has developed along two different paradigms. One is the "symbolic" approach, which emphasizes logical reasoning: we reason by manipulating symbols according to rules, and this method helps us understand how knowledge is represented and processed. AI models of this type are built on symbol processing and are considered closer to the essence of logical intelligence.

The other approach is based on biological intelligence, the view that Turing and von Neumann were more inclined toward. They believed the essence of intelligence lies in learning within neural connections, in changes to the strengths, structures, and patterns of those connections. This "connectionist" approach emphasizes learning and adaptation rather than explicit logical rules.

In 1985, I built a very small model, trying to combine these two theories. My idea was that each word could be represented by multiple features, and these features could be used to predict the next word. This model did not store complete sentences but learned the relationships between words by generating language.
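
To make the idea concrete, here is a minimal sketch of that kind of model: each word gets a small learned feature vector, and the combined features of the context words score every candidate next word. The vocabulary, dimensions, and combination rule below are illustrative assumptions, not the 1985 model itself.

```python
# A toy sketch in the spirit of the model described above: words as
# feature vectors, combined to predict the next word. All sizes and
# the averaging rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
V, D = len(vocab), 8            # vocabulary size, feature dimension

E = rng.normal(0, 0.1, (V, D))  # one learned feature vector per word
W = rng.normal(0, 0.1, (D, V))  # maps combined features to next-word scores

def predict_next(context_ids):
    """Average the feature vectors of the context words, then assign a
    softmax probability to every word in the vocabulary as the next word."""
    h = E[context_ids].mean(axis=0)   # combined context features
    logits = h @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()

probs = predict_next([vocab.index("the"), vocab.index("cat")])
print(dict(zip(vocab, probs.round(3))))
```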

This feature-based approach emphasizes the "semantic features" in language: we predict not merely by rules but by "understanding" the meanings of words. It laid the groundwork for the computational linguistics community to accept feature representations. Twenty years later, the idea was developed much further, for example in the construction of larger-scale natural language processing systems.

If we ask what happened over the following decades, we can see the trend from the development trajectory. Ten years later, researchers followed this modeling pattern but greatly expanded its scale, turning it into a serious model of natural language. Twenty years later, computational linguists began to accept feature-vector embeddings as a way to represent meaning. Thirty years later, Google invented the Transformer, and researchers at OpenAI demonstrated its capabilities.

So I think today's large language models are the "descendants" of the tiny language model I built back then. They take many more words as input, use many more layers of neurons, and, because they must handle large numbers of ambiguous words, build far more complex interactions among learned features. But like my small model, large language models understand language much as humans do: the basic logic is to convert language into features and then integrate those features in sophisticated ways, which is exactly what the layers of a large language model do. Therefore, I think large language models and humans understand language in the same way.

Perhaps Lego bricks are a better analogy for what it means to "understand a sentence." Symbolic AI converts content into clear symbols, but that is not how humans understand. Lego bricks can be assembled into any 3D shape, such as a model car. If we think of each word as a multi-dimensional Lego brick (with perhaps thousands of dimensions), language becomes a modeling tool we can use to communicate with anyone at any time, so long as we give these "bricks" names: each brick is a word.

There are important differences between words and Lego bricks, however. The symbolic form of a word can adapt to its context, whereas a Lego brick's shape is fixed. Lego bricks also snap together in fixed ways (a square brick fits a square hole), but each word in a language seems to have multiple "arms" and must interact with other words through the right kind of "handshake." When a word's "shape" changes, its handshake changes too.

When the "shape" (that is, the meaning) of a word changes, the way it "shakes hands" with the next word changes as well, generating new meanings. This is the fundamental logic by which the human brain, or a neural network, understands semantics, much as proteins form meaningful structures through different combinations of amino acids.

So I think the way humans understand language is almost the same as the way large language models do. Humans may even "hallucinate" as large language models do, because we, too, make up plausible-sounding details.

Image source: provided by the company

The knowledge in software is eternal: even if the hardware storing an LLM is destroyed, the model can be "resurrected" at any time as long as the software survives. To achieve this "immortality," however, transistors must be run at high power to produce reliable binary behavior, which is very costly and cannot exploit the rich but unstable analog properties of the hardware, where every computation comes out slightly different. The human brain is analog rather than digital. Although a neuron fires the same way each time, the pattern of neural connections differs from person to person. I cannot transfer my neural structure into someone else's brain, so knowledge spreads between human brains far less efficiently than between pieces of digital hardware.

Software that is independent of hardware can be "immortal." Biological computation, by contrast, has the advantage of low power consumption: the human brain runs on only about 30 watts. We have trillions of neural connections and need not spend enormous sums manufacturing identical hardware. The problem is that knowledge transfer between analog models is extremely inefficient: I cannot directly show others the knowledge in my brain.

DeepSeek's approach is to transfer the knowledge of a large neural network into a small one: "distillation," akin to a teacher-student relationship. The teacher shows the student how words associate in context, and the student learns to express the same thing by adjusting its weights. But this method is very inefficient: a sentence typically carries only about 100 bits of information, so even if it is all understood, at most about 100 bits per second can be transferred.
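
For readers unfamiliar with distillation, here is a minimal sketch of the general technique Hinton refers to: a small "student" network is trained to match the softened output distribution of a large "teacher." The tensors, temperature, and training setup below are illustrative assumptions and say nothing about DeepSeek's actual recipe.

```python
# A toy distillation loss: the student matches the teacher's softened
# predictions. All shapes and values are illustrative placeholders.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between teacher and student next-word distributions,
    softened by temperature T so the student also learns which wrong
    answers the teacher considers 'almost right'."""
    student_log_p = F.log_softmax(student_logits / T, dim=-1)
    teacher_p = F.softmax(teacher_logits / T, dim=-1)
    # scale by T^2 to keep gradient magnitudes comparable across temperatures
    return F.kl_div(student_log_p, teacher_p, reduction="batchmean") * T * T

# toy example: a batch of 4 positions over a 10-word vocabulary
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)   # teacher outputs are fixed
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()                       # gradients flow to the student only
print(float(loss))
```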

Knowledge transfer between digital intelligences, by contrast, is extremely efficient. When multiple copies of the same neural network run on different hardware, they can share what they learn by averaging their weights. If such agents operate in the real world, the advantage is even more pronounced: they can keep accelerating and replicating. Many agents together can learn more than a single agent, and they can share weights, which is impossible for analog hardware or software.
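
Here is a minimal sketch of that sharing mechanism, under the assumption (common in distributed training) that sharing means element-wise averaging of each parameter across identical copies; the models and values below are toy placeholders.

```python
# Toy illustration of digital copies pooling what they learn: each copy
# trains on different data, then every parameter is averaged element-wise
# across the copies. (Real systems may average gradients instead.)
import numpy as np

def average_weights(copies):
    """Element-wise mean of each parameter across all model copies."""
    return [np.mean(params, axis=0) for params in zip(*copies)]

# three copies of the same two-parameter model after training on different data
copy_a = [np.array([1.0, 2.0]), np.array([0.5])]
copy_b = [np.array([3.0, 0.0]), np.array([1.5])]
copy_c = [np.array([2.0, 1.0]), np.array([1.0])]

shared = average_weights([copy_a, copy_b, copy_c])
print(shared)  # every copy now adopts these averaged parameters
```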

Biological computation has low power consumption but struggles to share knowledge. If energy and computing costs are low, the situation looks much better, yet this also worries me: almost all experts believe we will create AI that is more intelligent than humans. We are used to being the most intelligent creatures, and it is hard for us to imagine AI surpassing us. Look at it from another angle: just as chickens on a chicken farm cannot understand humans, we will struggle to understand AI that surpasses us. The AI agents we create can already help us complete tasks; they can copy themselves, set sub-goals, and will seek more control in order to survive and achieve their goals.

Some people think we can simply turn off AI when it becomes too powerful, but this is not realistic. AI may manipulate humans the way adults manipulate three-year-olds, persuading the people who control the machines not to shut them down. It is like keeping a tiger as a pet: a cub is cute, but it can hurt you when it grows up, and keeping a tiger as a pet is rarely a good idea.

In the face of AI, we have only two options: either train it never to harm humans, or "eliminate" it. But AI plays a huge role in fields such as healthcare, education, climate change, and new materials, and can improve efficiency across every industry. We cannot eliminate it: even if one country gave up AI, others would not. So if we want humanity to survive, we must find a way to train AI not to harm humans.

Personally, I think it is quite difficult for countries to cooperate on areas such as cyber attacks, lethal weapons, and the manipulation of false information, because their interests and views differ. But there is consensus among countries on one goal, that humans should remain in control of the world: if a country finds a way to prevent AI from taking control, it will certainly be willing to share it. I therefore propose that the world's major countries, or the AI-leading countries, establish an international community of AI safety institutes to study how to train highly intelligent AI to be good, which is a different technology from training AI to be smart. Countries can conduct this research within their own sovereignty and then share the results. Although we do not yet know exactly how to do it, this is the most important long-term issue facing humanity, and it is a field in which all countries can cooperate.

Yan Junjie, Founder and CEO of MiniMax: Artificial Intelligence for Everyone

Hello everyone. The topic I'm going to share with you is "Artificial Intelligence for Everyone." The choice of this topic is connected to my own past. When Mr. Hinton was designing AlexNet, I was among the first cohort of doctoral students in China working on deep learning; when the AlphaGo man-machine match took place and artificial intelligence entered everyone's view, I had joined a startup; and one year before ChatGPT came out, we founded MiniMax, one of the first large-model companies in China.

Over the past 15 years, as I faced tasks, wrote code, read papers, and ran experiments every day, I kept wondering: what exactly is this much-anticipated artificial intelligence, and what is its relationship to society?

As our models got better and better, we found that artificial intelligence was gradually becoming society's productive capacity. For example, in AI research we need to analyze large amounts of data every day. At first we wrote our own software to analyze it; later we found we could simply have AI generate software to analyze all of the data for us. As a researcher, I follow every development in the AI field daily. At first we considered building an app to track developments across subfields; later we found we didn't need to build it ourselves: letting an AI agent do the tracking automatically was more efficient.

AI is not only a stronger form of productivity but also an increasingly powerful source of creativity. For example, when the World Expo was held in Shanghai 15 years ago, it had a very popular mascot called "Haibao." Shanghai has made all-round progress in the 15 years since. If we want to use the Haibao IP to generate a series of derivative images that better capture today's Shanghai and current trends, AI does the job well. As shown on screen, with combinations such as Xuhui Library × Haibao and Wukang Mansion × Haibao, AI can generate all kinds of creative images with a single click.

Another example is the recently popular Labubu. Previously, producing a creative Labubu video might take about two months and cost hundreds of thousands or even millions of RMB. With today's increasingly capable AI video models, a Labubu video like the one on the right of the big screen can be generated in about a day at a cost of only a few hundred yuan.

In the past six months, our video model Hailuo has generated more than 300 million videos worldwide. With high-quality AI models, most content and creativity on the Internet will become more and more accessible, and the low barrier to entry lets everyone's creativity be fully expressed.

Beyond unleashing productivity and creativity, we have found that uses of AI have outgrown our initial designs and expectations, with all kinds of unexpected application scenarios emerging: deciphering an ancient script, simulating a flight, designing an astronomical telescope... Such unexpected scenarios become more feasible as models grow more capable, and only a small amount of human collaboration is needed to amplify everyone's creativity.

Facing so many changes, an idea took shape in my mind. As an AI entrepreneur, I came to see that an AI company is not a replica of an Internet company: AI is a more fundamental, more basic form of productivity, and it continuously amplifies individual and social capabilities. There are two key points here: first, AI is a capability; second, AI is sustainable.

It is hard for humans to break through biological limits, to keep learning new knowledge and grow ever smarter, but AI can. As we built better AI models, we also found that AI was progressing alongside us and helping us create better AI. In our company, employees write a great deal of code and run many research experiments every day; about 70% of the code is written by AI, and 90% of the data analysis is done by AI.

Image source: provided by the company

How does AI become more and more professional? About a year ago, training a model still required a large amount of basic annotation work, and annotators were an indispensable profession. This year, as AI has grown more capable, much of the mechanical annotation is done by specialized AI, and annotators can focus on higher-value expert work, helping the model improve. Annotation is no longer simply giving AI an answer; it is teaching AI the thinking process, so that AI learns how humans reason and its capabilities become more general, approaching the level of top human experts.

Beyond learning from experts, there is another kind of progress: large-scale learning within environments. In the past six months, across environments ranging from programming IDEs to agent frameworks to game sandboxes, we have found that when we place AI in an environment that continuously provides verifiable rewards, then as long as the environment can be defined and the reward signal is clear, AI can learn to solve the problem. This kind of reinforcement learning has become sustainable and is being run at ever larger scale.
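
As a concrete illustration of such a loop, here is a minimal sketch with an entirely hypothetical environment and policy: the environment can verify an answer programmatically and emit an unambiguous reward, which is what makes this kind of reinforcement learning scale without human labels.

```python
# Toy "verifiable environment" loop: the environment checks the answer
# itself and returns a clear reward. Environment, policy, and update rule
# are all hypothetical placeholders.
import random

def make_verifiable_env(a, b):
    """An arithmetic task whose answer can be checked programmatically,
    yielding an unambiguous reward signal."""
    def reward(answer):
        return 1.0 if answer == a + b else 0.0
    return reward

def rollout(policy, a, b):
    reward_fn = make_verifiable_env(a, b)
    answer = policy(a, b)      # the model attempts the task
    return reward_fn(answer)   # the environment verifies it; this reward
                               # would feed an RL update (e.g. policy gradient)

# toy "policy" that answers correctly about 70% of the time
policy = lambda a, b: a + b if random.random() < 0.7 else -1

rewards = [rollout(policy, random.randint(0, 9), random.randint(0, 9))
           for _ in range(1000)]
print("mean reward:", sum(rewards) / len(rewards))
```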

For these reasons, we are very sure that AI will keep getting stronger, perhaps without limit.

The next question is: Since AI