At 65, LeCun returns to his hometown of Paris, breaks with Zuckerberg, and unveils his mysterious AI startup.
Mark Zuckerberg offers verbal support but is reluctant to put in real money!
After working at Meta for 12 years, Turing Award winner Yann LeCun will leave the company at the end of the year.
Last month, 65-year-old LeCun announced that he would leave Meta at the end of this year to start his own business.
In his resignation letter, he thanked colleagues for their continued attention and support, and said Meta would become a partner of the new company, though he gave no further details.
At yesterday's ai-Pulse event in Paris, LeCun said that Meta is not an investor.
LeCun's mysterious startup won't build another ChatGPT. Instead, it aims to teach AI to understand the physical world, focusing on what LeCun calls advanced machine intelligence: AI trained on sensory information, such as vision, to predict the physical world.
LLMs are AI's black hole
The world's largest tech giants are pouring billions of dollars into AI, especially the large language models (LLMs) that power ChatGPT, Google Gemini, and Meta's Llama.
These giants believe that scaling laws alone are enough to carry LLMs down the path to AGI.
However, for months Yann LeCun has been swimming against the tide, insisting that LLMs have hit a ceiling: they generate text well, but they don't understand the physical world, have no persistent memory, and struggle with multi-step reasoning.
Ph.D. students shouldn't work on LLMs.
LLMs are almost obsolete.
LLMs are just token generators, belonging to System 1, without real reasoning.
Autoregressive LLMs lack four capabilities required to reach the intelligence of humans (or even dogs)...
In short, he seems to have lost all interest in LLMs, casting them aside like worn-out shoes.
Until recently, Yann LeCun firmly maintained that LLMs were a "cancer" in the AI research community.
Last month in Brooklyn, Yann LeCun said bluntly, "Indeed, LLMs are great and useful. Many people will use them, and we should invest in them."
But the problem is: "Currently, LLMs are like a black hole, sucking up all resources and attention, leaving almost no room for other fields. For the next revolution, we must take a step back and calmly think about what is missing from the current path."
These remarks are particularly thought-provoking.
For months, Meta has been spending billions of dollars to recruit an all-star lineup of LLM experts.
According to OpenAI's chief research officer, Meta went beyond simply throwing money around: to poach talent, Zuckerberg personally delivered soup to OpenAI employees, appealing to both interests and emotions, a Silicon Valley version of "Three Visits to the Thatched Cottage," the classic tale of Liu Bei's persistent courtship of Zhuge Liang.
In essence, this amounts to a repudiation of Yann LeCun's technical approach.
As Meta's chief AI scientist, LeCun has publicly opposed Zuckerberg's direction.
With tensions this high, the clash between their ideas is plain to see. No wonder LeCun is leaving Meta after 12 years.
LeCun: I've been working on world models for almost 10 years
For years, Yann LeCun has been a staunch critic of LLMs.
He has long believed that simply "devouring" text from the Internet cannot produce real machine intelligence.
He thinks that autonomous machine intelligence requires a different approach: World Models.
At the ai-Pulse general assembly, a key platform for AI research in France, LeCun elaborated on this vision alongside Pim de Witte, founder of General Intuition, a pioneering company in the world-model field, analyzing how world models can become the cornerstone of tomorrow's AI and the next major technological breakthrough.
On stage together: Yann LeCun, Meta's chief AI scientist; Pim de Witte, CEO of General Intuition; Neil Zeghidour, chief modeling officer of Kyutai; and Xavier Niel, founder of iliad Group.
Actually, the concept of "world models" is very old.
As early as 1943, twelve years before the term "artificial intelligence" appeared, the 29-year-old Scottish psychologist Kenneth Craik pondered in his monograph The Nature of Explanation:
If an organism can carry a "small-scale model" of the external reality in its mind...
It can try out various possibilities, infer the best solution...
And respond in a more comprehensive, safer, and more appropriate way.
His concept of mental models or simulations foresaw the "cognitive revolution" that changed psychology in the 1950s and still dominates cognitive science today.
More importantly, it directly links cognition with computation: Craik believed that the "ability to parallel or simulate external events" is a fundamental characteristic shared by both the "nervous system" and "computing machines."
About 10 years ago, LeCun began telling everyone that this is the path to advancing AI.
In fact, he had been thinking about it for even longer, but his keynote at NeurIPS 2016 was the first time he systematically and publicly declared, "This is the direction we need to tackle next."
Then, over roughly the next five years, he gradually concluded that generative models alone could not get there, and he began developing a new, non-generative method: JEPA (Joint Embedding Predictive Architecture).
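To make the contrast concrete, here is a minimal sketch of the JEPA idea, not LeCun's actual implementation: two encoders map a partial (context) view and a full (target) view into embeddings, a predictor guesses the target embedding from the context embedding, and the loss is computed in embedding space rather than pixel space. The toy MLPs, the random masking, and the EMA rate are all illustrative assumptions; real JEPAs such as I-JEPA use vision transformers and carefully designed masking.

```python
# Toy sketch of the JEPA idea (illustrative assumptions throughout).
import torch
import torch.nn as nn

class ToyJEPA(nn.Module):
    def __init__(self, dim_in=128, dim_emb=64):
        super().__init__()
        # Context encoder: sees a masked/partial view of the input.
        self.context_encoder = nn.Sequential(
            nn.Linear(dim_in, dim_emb), nn.ReLU(), nn.Linear(dim_emb, dim_emb))
        # Target encoder: same shape, updated as a slow EMA copy (no gradients).
        self.target_encoder = nn.Sequential(
            nn.Linear(dim_in, dim_emb), nn.ReLU(), nn.Linear(dim_emb, dim_emb))
        self.predictor = nn.Linear(dim_emb, dim_emb)

    def forward(self, x_context, x_target):
        z_context = self.context_encoder(x_context)
        with torch.no_grad():                        # targets provide no gradient
            z_target = self.target_encoder(x_target)
        z_pred = self.predictor(z_context)
        # Key point: the loss lives in embedding space, not pixel space, so the
        # model never wastes capacity reconstructing unpredictable detail.
        return ((z_pred - z_target) ** 2).mean()

@torch.no_grad()
def ema_update(model, tau=0.99):
    # Slowly drag the target encoder toward the context encoder.
    for p_t, p_c in zip(model.target_encoder.parameters(),
                        model.context_encoder.parameters()):
        p_t.mul_(tau).add_(p_c, alpha=1 - tau)

model = ToyJEPA()
opt = torch.optim.Adam(
    list(model.context_encoder.parameters()) + list(model.predictor.parameters()),
    lr=1e-3)
x = torch.randn(32, 128)                             # stand-in "observations"
mask = (torch.rand_like(x) > 0.5).float()            # hide half the input
loss = model(x_context=x * mask, x_target=x)
opt.zero_grad(); loss.backward(); opt.step(); ema_update(model)
```

Predicting in embedding space is exactly what makes the architecture non-generative: it can ignore details of the world that are impossible to predict instead of being penalized for failing to reconstruct them.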
Later, large language models (LLMs) emerged, and they are generative.
At that time, LeCun's reaction was: "Okay, this is interesting. It's very useful for language, and of course we should study it. There will be a lot of applications."
But he firmly believes that this is not the path to human-level intelligence (or whatever you want to call it).
In other words, long before the LLM explosion, he had concluded that just scaling language models won't bring real intelligence.
Robots are less intelligent than dogs
As humans, we tend to think that language is essential for intelligence, but that's not the case.
And the fact is a bit counterintuitive: Understanding the physical world is much more difficult than understanding language.
This may sound a bit surprising, but it's really the case.
In robotics, people realized this a long time ago.
In the late 1980s, famous roboticist Hans Moravec pointed out:
It's relatively easy to make a computer play chess as well as an adult;
But it's quite difficult or even impossible to make a computer have the perception and action abilities of a one-year-old child.
This later became known as "Moravec's paradox."
LeCun gave an up-to-date example: the best current AI can pass the bar exam and write code, yet we still don't have a robot that can act in the physical world as competently as a five-year-old child.
Obviously, current AI lacks something really important.
He believes that when we reason about real-world scenarios, we rely on "mental models": representations of a situation that we manipulate in our heads. We have physical intuition, and most of it is learned rather than innate: in their first months of life, infants learn mainly by observing the world, with a little interaction mixed in.
In the past 10 years, LeCun has been trying to replicate this human learning method:
In the first 5 years, he mostly hit dead ends;
In the next 5 years, he started to make more substantial progress, relying on non-generative architectures.
These systems can learn the structure of the real world, predict its evolution, and simulate possible scenarios.
If LLMs are just "predicting," then world models are "understanding."
If LLMs are just "reacting," then world models are "planning."
Their ability to build coherent internal representations opens the door for AI to reason, act, and interact in complex environments.
How to build world models?
At first, many people thought that after language models, the next natural step would be to add audio first and then video.
But interestingly, LeCun is not just working on "video models." He is also using video game datasets to build world models.
LeCun explained why video alone is not enough, and what else is needed.
First, he admits that video is very important for understanding the world. Basically, video is one of the closest representations to reality we can get.
But he prefers to compare video to a dream: Most of the time, in a dream, you can't really "interact with what you see." You're just a spectator, not a participant.
Fundamentally, human learning is highly interactive.
World models not only predict the next frame of video but also predict the distribution of all possible outcomes under different actions.
This means that in addition to video representations, you also need a large amount of action and interaction data to truly build these world models.
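A rough sketch of what such an action-conditioned world model looks like as code is below. The MLP dynamics, the discrete action set, and the goal-distance score are all illustrative assumptions, not any company's actual system; the point is that prediction is conditioned on an action, which lets the model "imagine" each candidate outcome before committing to one.

```python
# Toy sketch of an action-conditioned latent world model (assumptions:
# observations are already encoded into latent vectors; an MLP stands in
# for the real dynamics model).
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    """Predicts the next latent state given the current state and an action."""
    def __init__(self, dim_state=32, n_actions=4):
        super().__init__()
        self.action_emb = nn.Embedding(n_actions, dim_state)
        self.dynamics = nn.Sequential(
            nn.Linear(2 * dim_state, 64), nn.ReLU(), nn.Linear(64, dim_state))

    def forward(self, z, action):
        a = self.action_emb(action)
        return self.dynamics(torch.cat([z, a], dim=-1))

@torch.no_grad()
def plan_one_step(model, z, score_fn, n_actions=4):
    # Planning by simulation: roll each candidate action forward in latent
    # space and score the imagined outcome, instead of merely reacting.
    best_action, best_score = None, float("-inf")
    for a in range(n_actions):
        z_next = model(z, torch.tensor([a]))
        s = score_fn(z_next).item()
        if s > best_score:
            best_action, best_score = a, s
    return best_action

model = LatentWorldModel()
z0 = torch.randn(1, 32)                       # current (encoded) observation
goal = torch.randn(1, 32)                     # hypothetical goal embedding
score = lambda z: -((z - goal) ** 2).mean()   # closer to the goal = better
print(plan_one_step(model, z0, score))
```

Training such a model is what requires the action and interaction data LeCun mentions: every (state, action, next state) triple is one supervised example for the dynamics network.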
LeCun likes a more intuitive analogy:
LLMs are a bit like snowballs: rolling down a hill and picking up more snow along the way.
They are auto-regressive: feeding their own output back into the model to predict the next token.
They have no "perception." Their whole world is themselves, so they keep rolling and rolling, not knowing what they're about to hit when they reach the bottom of the hill.
True intelligence is more like Olaf, the snowman in the movie "Frozen." He knows there's a rock in front and spreads himself out to go around it.
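LeCun's snowball is, concretely, the decoding loop below: a minimal, hypothetical sketch in which a toy recurrent model (TinyLM is a stand-in; real LLMs are transformers over subword tokens) feeds its own greedy prediction back into its input, with no perception of anything outside the token sequence itself.

```python
# Toy sketch of the autoregressive loop (illustrative, not a real LLM).
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):                  # tokens: (batch, seq)
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h[:, -1])              # logits for the NEXT token only

model, tokens = TinyLM(), torch.tensor([[1, 7, 42]])
for _ in range(10):
    logits = model(tokens)
    nxt = logits.argmax(dim=-1, keepdim=True)   # greedy pick
    # The "snowball": the prediction becomes part of its own future input.
    tokens = torch.cat([tokens, nxt], dim=1)
print(tokens)
```

Nothing in this loop ever observes the world or revises a plan; each step only consumes the sequence the model itself produced, which is the limitation LeCun is pointing at.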
The limitation of text is:
The world we perceive is extremely rich, while text is only a tiny, highly compressed subset of it: an invention for describing the world, built on top of our three-dimensional perception.
But for world models and agents, you must be able to interact with the environment to build a general intuition of the environment you're in.
We tend to think that "most human knowledge is embodied in text" because a great deal of what we consider knowledge has indeed been written down.
But in fact, not all human knowledge can be expressed well in text.
Most real human knowledge consists of mental models and intuitions about the physical world and everyday situations, and these do not exist directly in textual form.
Human thinking happens in our brains. It doesn't operate in the form of tokens but more in the form of mental imagery and other representations.
LeCun hopes to build a system that can also achieve this.
Goodbye, LeCun! Meta won't invest
In his resignation letter, LeCun wrote that although he is parting ways with Meta, executives like Zuckerberg support his startup project.