
New paper from Yann LeCun's team: Mimicking human intelligence in AI development leads to a dead end.

QbitAI, 2026-03-09 19:24
Humans themselves are not actually "general-purpose" either.

Artificial General Intelligence (AGI), the goal the AI community has pursued for years, may have taken a wrong turn from the very beginning.

Turing Award winner Yann LeCun argues in his latest paper that the future of AI should not lie in imitating humans, but in following another path:

Superhuman Adaptable Intelligence (SAI).

In this framework, there are three key changes in the development goals of AI:

Stop using humans as the frame of reference.

Embrace specialization and achieve superhuman abilities in specific fields.

The core indicator for measuring intelligence changes from "how many skills one has" to "the speed of learning new skills."

In other words, SAI no longer pursues being "as smart as humans" but focuses on a more fundamental matter:

The speed at which the system adapts to new tasks.

It's worth mentioning that LeCun also put forward a rather interesting view in the paper: Humans themselves are actually not "general".

Our so-called "general abilities" are, to a large extent, simply the product of biological evolution:

Over a long evolutionary process, humans gradually acquired a combination of abilities selected for survival.

Towards Superhuman Adaptable Intelligence (SAI)

According to the paper's definition, Superhuman Adaptable Intelligence (SAI) refers to a system that:

Can, through rapid adaptation, surpass humans on tasks humans are able to perform, and can also solve a large number of task domains that humans have never engaged with.

There is a key shift here. In the past, the development logic of AI was: Use humans as the benchmark for intelligence.

As long as a machine can reach the "human level," it is considered a success, such as in the Turing test.

But the LeCun team believes that there is a problem with this thinking. There is a very straightforward sentence in the paper:

Anchoring intelligence to the human baseline is orthogonal to the path to superhuman abilities.

In other words: If the goal is only to "reach the human level," it may actually limit the development of AI.

From this perspective, what really deserves optimization is not the ability of the model to complete a certain fixed task, but:

The speed at which the system adapts to new tasks.
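To make this metric concrete, here is a minimal sketch (with hypothetical accuracy curves, not taken from the paper) of scoring a learner by adaptation speed rather than final skill: count how many samples it needs before reaching a target accuracy on a brand-new task.

```python
# Hypothetical sketch: rank learners by adaptation speed on a new task.
# The accuracy curves below are made-up illustrative data.

def samples_to_threshold(accuracy_curve, threshold=0.9):
    """accuracy_curve[i] = accuracy after seeing i+1 samples.
    A lower return value means faster adaptation."""
    for n, acc in enumerate(accuracy_curve, start=1):
        if acc >= threshold:
            return n
    return None  # never adapted within the sample budget

fast_learner = [0.5, 0.7, 0.92, 0.95]
slow_learner = [0.5, 0.55, 0.6, 0.91]

print(samples_to_threshold(fast_learner))  # 3
print(samples_to_threshold(slow_learner))  # 4
```

Under this view, both learners eventually reach the threshold, but the first one is "more intelligent" in the SAI sense because it gets there with fewer samples.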

Because once "imitating humans" is regarded as the goal, the task space of AI is actually artificially restricted —

AI learns what humans can do, but human abilities are just the result of biological evolution and do not represent all the possibilities of intelligence.

A more reasonable path is to let AI continuously optimize around clear goals and continuously improve its abilities through self-play, evolutionary search, and large-scale simulation.

This view also echoes the view put forward by Richard S. Sutton in "The Bitter Lesson" —

What really drives the progress of AI is often not the skills of imitating humans but large-scale computing and general learning methods.

Under this framework, AI does not need to imitate humans and can directly surpass human performance in many tasks.

On the contrary, focusing too much on the "human level" not only misleads research goals but also confines AI development to a human-centered task space.

Why shouldn't AI imitate humans?

As an important prerequisite for moving towards superhuman intelligence, there is also a very interesting view in the paper: Humans are actually not as "general" as we think.

The human brain was not designed for mathematics, programming, or scientific research. It had only one original purpose: to survive in the wild.

In other words, human intelligence is essentially a survival tool shaped by evolution.

Natural selection and evolution have optimized our abilities, making us good at visual perception and walking.

These abilities seem very "general" to us only because they are crucial for survival. Once we leave this evolutionary comfort zone, our performance in other cognitive tasks is actually not very good.

For example, in tasks such as calculating complex probabilities, high-dimensional optimization, and large-scale logical search, humans perform far worse than computers.

The most classic example is chess. Top chess players seem extremely smart among humans, but they have long had no chance of winning against computers.

This actually shows one thing: the so-called "AGI" is, to a large extent, an illusion. We simply cannot see our own biological blind spots.

This phenomenon is known as jagged intelligence, or Moravec's Paradox.

To put it simply: The things that humans find easiest (such as walking and grasping objects) are actually the most difficult for computers;

While the things that humans find difficult, such as playing chess and mathematical calculations, are very easy for computers.

The reason is simple. Those "simple" abilities are actually the result of millions of years of human evolution. They are actually not simple at all; we are just used to them.

So the conclusion drawn in the paper is also very straightforward:

If future AI only replicates this "survival-type intelligent toolbox" of humans, it will be a wrong technological path.

Specialization is the norm for intelligent evolution

Since the "generality" of humans itself is a blind spot, then what is the truly correct direction?

Yann LeCun cited experiences from the fields of biology and machine learning in the paper and gave the answer:

Specialization is the norm for intelligent evolution.

From a biological perspective, specialization is the norm. In a situation where resources are limited and the environment is complex, evolution will continuously push the system to optimize in the direction of specific abilities.

AI systems actually face the same pressure.

If the tasks in a certain field require high costs, precision, and reliability, any model that fails to meet the requirements will be replaced by a more specialized system.

There are many such examples in the real world. The most typical one is AlphaFold.

This system is specifically designed for protein structure prediction. Through task-specific architectures, data, and training strategies, it has achieved a huge breakthrough and directly changed the entire biological field.

This also reflects a basic law in machine learning: The success of an algorithm often comes from its match with the problem structure (target distribution).

If a model has to fold clothes, drive a car, write code, and predict protein structures, it is likely to only achieve "mediocre" results in all tasks.

This is why LeCun said in the paper:

The AI that helps us fold proteins should not be the same AI that helps us fold clothes!

There is a classic phenomenon in machine learning: Negative Transfer.

When multiple tasks compete for the same model capacity, their gradients may conflict with one another, dragging down performance on all of them.

Therefore, from an engineering and theoretical perspective: Forcibly pursuing generality is often an inefficient path.
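The gradient-conflict mechanism behind negative transfer can be illustrated with a deliberately tiny toy example (a hypothetical setup, not from the paper): one shared scalar parameter serves two tasks whose optima lie in opposite directions.

```python
# Toy illustration of negative transfer via gradient conflict.
# Hypothetical setup: a single shared parameter w serves two tasks
# whose optimal values are opposite (task A wants w=1, task B wants w=-1).

def grad_task_a(w):
    # Derivative of task-A loss (w - 1)^2 with respect to w
    return 2 * (w - 1)

def grad_task_b(w):
    # Derivative of task-B loss (w + 1)^2 with respect to w
    return 2 * (w + 1)

w = 0.0
ga, gb = grad_task_a(w), grad_task_b(w)

# A negative product means the two gradients point in opposite
# directions: any shared update helps one task while hurting the other.
print(ga * gb < 0)  # True: the gradients conflict

# The combined multi-task gradient cancels out entirely, so joint
# training makes no progress on either task.
print(ga + gb)  # 0.0
```

In a real multi-task model the conflict is rarely this total, but the same cancellation effect is what drags a single "do-everything" model toward mediocrity on every task.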

LeCun's answer: SSL + World Model

So the question is, if we don't pursue AGI, how should we pursue SAI?

The technical path given by the LeCun team consists of three keywords:

Self-Supervised Learning + World Model + Modular System.

First is Self-Supervised Learning. This method does not rely on human annotation but learns the underlying structure from a large amount of real-world data.

Second is World Models: let AI build an "internal simulator of the world," just as humans do, predicting the future, making plans, and simulating the outcomes of actions in the mind.

In this way, the system can complete new tasks without explicit training.
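The "simulate before acting" idea can be sketched in a few lines (a hypothetical 1-D navigation task; the hand-coded `world_model` here stands in for a predictor that a real system would learn from data):

```python
# Minimal sketch of planning with a world model: try actions in
# imagination, keep the one whose predicted outcome looks best.
# The world model below is hand-coded only for illustration.

def world_model(state, action):
    # Predict the next state given the current state and an action.
    return state + action

def plan(state, goal, actions, horizon=5):
    """Greedy planning: at each step, simulate every candidate action
    with the world model and take the one that brings the predicted
    state closest to the goal. No task-specific training is involved."""
    for _ in range(horizon):
        best = min(actions, key=lambda a: abs(world_model(state, a) - goal))
        state = world_model(state, best)
    return state

final = plan(state=0.0, goal=3.0, actions=[-1.0, 0.0, 1.0])
print(final)  # 3.0: the goal is reached purely by simulated lookahead
```

Because all the "trial and error" happens inside the model's predictions, the system can handle a new goal it was never explicitly trained on, which is exactly the adaptability SAI prioritizes.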

Finally, there is the modular architecture. The paper clearly opposes the view that there is a "one-size-fits-all" model architecture, especially the next-token prediction of the autoregressive paradigm.

In the future, AI is more likely to be a series of collaborative systems rather than a universal model.

Author Introduction

The first author of this paper is Judah Goldfeder, a doctoral student from Columbia University, who studies under Professor Hod Lipson.

Previously, he interned at institutions such as Google, Twitter, and Meta. His research interests mainly focus on reinforcement learning, algorithmic game theory, multi-agent artificial intelligence, unsupervised representation learning, multi-task learning, and geometric learning.

The other authors of the paper also include Philippe Wyder, who also studies under Professor Hod Lipson, and Turing Award winner Yann LeCun.

In addition, the author team includes Ravid Shwartz-Ziv, an assistant professor and Faculty Fellow at the Center for Data Science at New York University.

He is mainly engaged in cutting - edge research in artificial intelligence, focusing on large language models (LLMs) and their applications.

Reference Links

[1]https://x.com/rohanpaul_ai/status/2029533545161740321

[2]https://arxiv.org/abs/2602.23643

This article is from the WeChat official account "QbitAI" (focused on cutting-edge technology). Republished by 36Kr with permission.