Warning from a top UC Berkeley expert: humans may have only five years left for the jobs they do themselves!
The five-year countdown has begun. Sergey Levine, a leading robotics researcher at UC Berkeley, put it bluntly: robots will soon enter the real world, taking over not only kitchens and living rooms but also factories, warehouses, and even data center construction. The real revolution is that once the "self-evolving flywheel" starts turning, it won't stop.
In five years, you may no longer need to fold clothes, cook, or mop the floor yourself!
Sergey Levine, a professor at UC Berkeley and a leading figure in robotics, predicts that by 2030 robots will be able to run an entire household on their own, much like a domestic helper.
This is not just a flashy demo; it is a signal that the "self-evolving flywheel" is about to start.
Household chores are just the beginning. The bigger shock is that the blue-collar economy, manufacturing, and even data center construction will all be rewritten by the robot wave.
Five-year countdown: When will the flywheel truly start?
When Sergey Levine gave a median estimate of five years on a podcast, many people dismissed it as science fiction.
Yet the claim is not groundless: it rests on the steady accumulation of robot foundation models, real-world deployments, and practical feedback in recent years.
Meanwhile, Physical Intelligence's π0.5 model has enabled robots to complete complex household chores at scale, such as "cleaning the kitchen or bedroom", in unfamiliar home environments.
Figure: co-training tasks in the π0.5 recipe, including robot data from multiple different robot types as well as multimodal data with high-level subtask instructions, commands, and data from the web.
These advances are not staged demo videos; they are visible, practical capabilities. Actions such as taking clothes out of a laundry basket, clearing a table full of dishes, folding laundry, and assembling boxes are all achieved with modular models plus vision-language-action networks.
Levine also emphasized:
What truly marks the start of the flywheel is not building an impressive-looking robot, but a robot performing a task people are willing to pay for, and performing it well, in a real household.
Once that threshold is crossed, every real operation generates data, every piece of feedback drives improvement, and the flywheel truly starts turning.
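That loop can be sketched in a few lines of code. The toy simulation below is purely illustrative: the class, the numbers, and the update rule are invented stand-ins for "deployments generate episodes, feedback improves the policy", not anything from Levine's actual systems.

```python
import random

# Toy model of the deployment flywheel: each round of real-world work
# produces task outcomes, and each batch of feedback nudges the policy's
# success rate upward. All numbers here are invented for illustration.

class ToyPolicy:
    def __init__(self, success_rate=0.5):
        self.success_rate = success_rate

    def run_task(self, rng):
        # One paid household task; True means it succeeded.
        return rng.random() < self.success_rate

    def finetune(self, outcomes):
        # Corrected mistakes become training signal (toy update rule).
        failures = sum(1 for ok in outcomes if not ok)
        self.success_rate = min(0.99, self.success_rate + 0.01 * failures)
        return self

def deployment_flywheel(policy, n_homes=20, n_rounds=10, seed=0):
    rng = random.Random(seed)
    for _ in range(n_rounds):
        outcomes = [policy.run_task(rng) for _ in range(n_homes)]
        policy = policy.finetune(outcomes)  # feedback drives improvement
    return policy

policy = deployment_flywheel(ToyPolicy())
print(round(policy.success_rate, 2))
```

The point of the sketch is only the shape of the loop: deployment produces data, data produces improvement, and improvement makes the next deployment more valuable.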
Moreover, this is not a distant fantasy.
The research team at UC Berkeley recently demonstrated robots learning to assemble motherboards, and even IKEA furniture, with just one or two hours of real-world practice.
Efficiency still needs to improve, but it means the mechanism of "learning by doing" is already operating in the real world.
Autonomous driving is still struggling, but robots are racing toward deployment
When many people hear "household robots", their first reaction is: if autonomous driving still isn't widespread, how could robots arrive faster? Yet Sergey Levine believes they might.
The reason lies in the "make a mistake, correct it, learn" cycle.
When a robot folds clothes, clears dishes, or cooks at home, even a mistake can usually be corrected quickly, and the robot learns from the experience.
Driving on the road is completely different: a single mistake can be a disaster.
This means that robots in household scenarios can accumulate data and feedback more frequently and safely, and naturally learn faster.
Another advantage is common sense and intuitive perception.
In a household, robots face clutter, occlusions, and all kinds of objects, but the situation as a whole remains controllable.
Autonomous driving, in contrast, must handle high-speed motion, complex traffic, and unexpected events, and every decision bears on public safety, so the bar is far higher.
As an MIT researcher commented this year:
If reasoning and common sense are added to robot perception, the role they can play in the real world will far exceed our imagination.
Levine stressed that the real key is not building a universal robot, but making one perform a single task that people are willing to pay for, well enough, in the real world.
Once that threshold is crossed, the robot can start working, keep improving on the job, and then expand to more tasks.
This is also the fundamental reason why he believes that the "robot flywheel" may start earlier than autonomous driving.
The breakthrough is reflected not only in the faster pace of deployment but also in the reconstruction of the underlying models.
The technological foundation: VLA models and emergent capabilities
What lets robots move from demos to real household tasks is not one or two hard-coded instructions, but a new underlying architecture: the VLA model.
On the podcast, Sergey Levine laid out the concept of the VLA (vision-language-action) model.
The vision module captures the environment like eyes, the language module understands instructions and plans steps, and the action decoder acts like a "motor cortex", converting abstract plans into continuous, precise movements.
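In code form, the three modules are just a pipeline handing data to one another. The sketch below is a hypothetical stub, assuming toy dictionary "images" and string subtasks; none of these class or method names come from any real VLA implementation:

```python
# Structural sketch of a vision-language-action pipeline. Every class
# and method here is a hypothetical stand-in, not a real robotics API.

class VLAPolicy:
    def encode_vision(self, image):
        # "Eyes": reduce the raw observation to scene features (stub).
        return {"objects": image.get("objects", [])}

    def plan_language(self, instruction, scene):
        # Language module: expand a high-level command into subtasks.
        return [f"{instruction} {obj}" for obj in scene["objects"]]

    def decode_actions(self, subtask):
        # "Motor cortex": map each subtask to low-level motion primitives.
        return [("move_to", subtask), ("grasp", subtask), ("place", subtask)]

    def act(self, image, instruction):
        scene = self.encode_vision(image)
        plan = self.plan_language(instruction, scene)
        return [a for step in plan for a in self.decode_actions(step)]

policy = VLAPolicy()
actions = policy.act({"objects": ["shirt", "towel"]}, "fold")
print(len(actions))  # 2 objects x 3 primitives = 6 actions
```

In a real system each stub would be a neural network, but the division of labor is the same: perception produces scene state, language produces a plan, and the decoder produces motor commands.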
Unlike large language models, which only need to generate discrete text tokens, robots must handle continuous actions.
Levine revealed that they use methods such as flow matching and diffusion to achieve high-frequency, fine-grained control.
These techniques let robots go beyond single tasks like "fold one piece of clothing" to executing complex action sequences continuously.
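To give a feel for why flow matching suits continuous control (this is not Physical Intelligence's actual implementation): the action decoder integrates a velocity field that transports random noise into an action vector. In the sketch below, the "learned" field is replaced by the closed-form conditional optimal-transport velocity toward a known target action, so the integration step can be shown in isolation:

```python
import random

# Inference-time sketch of flow matching for action generation: start
# from Gaussian noise and Euler-integrate a velocity field until the
# sample lands on a continuous action vector. A trained model would
# predict the velocity from observations; here we use the closed-form
# conditional-OT velocity toward a known target, purely to illustrate.

def velocity(x, t, target):
    # Conditional optimal-transport path: v(x, t) = (target - x) / (1 - t)
    return [(a - xi) / (1.0 - t) for xi, a in zip(x, target)]

def sample_action(target, steps=100, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    dt = 1.0 / steps
    t = 0.0
    for _ in range(steps - 1):  # stop just short of t = 1 (divide by zero)
        v = velocity(x, t, target)
        x = [xi + vi * dt for xi, vi in zip(x, v)]
        t += dt
    return x

target = [0.3, -0.7, 0.1]  # e.g. a toy 3-DoF end-effector displacement
action = sample_action(target)
print([round(a, 2) for a in action])
```

Because the output is a real-valued vector refined over many small steps rather than a token sampled from a vocabulary, this family of methods naturally produces the smooth, high-frequency control that manipulation needs.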
Even more surprisingly, as scale grows, robots exhibit emergent capabilities.
In one experiment, a robot that accidentally picked up two pieces of clothing first tried to fold one; finding the other in the way, it put the extra garment back into the basket on its own, then went on folding the one in its hand.
When a shopping bag tipped over, it would also "spontaneously" set the bag upright. These details were never written into the training data; they appeared naturally during real-world operation.
A similar phenomenon also occurred in Stanford's Vocal Sandbox project.
Researchers found that in a gift-bag packing task, a robot could piece together low-level actions such as "pick up a toy car", "move to the gift bag", and "put it down" to complete a new composite task.
This shows that when vision, language, and action genuinely cooperate, robots can combine existing skills like Lego bricks to handle complex scenarios.
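The Lego-brick idea can be shown with a trivial skill library. The skill names and the "pack the gift bag" plan below are illustrative labels for the behaviors described above, not the Vocal Sandbox project's real API:

```python
# Composing a new task from an existing skill library. Each "skill"
# here just returns a command string; on a real robot it would emit
# a motion primitive. All names are illustrative stand-ins.

SKILLS = {
    "pick_up": lambda obj: f"pick_up({obj})",
    "move_to": lambda dest: f"move_to({dest})",
    "put_down": lambda obj: f"put_down({obj})",
}

def compose(plan):
    # A composite task is an ordered application of known primitives.
    return [SKILLS[name](arg) for name, arg in plan]

# New composite task, pieced together from skills the robot already has.
pack_gift_bag = [
    ("pick_up", "toy_car"),
    ("move_to", "gift_bag"),
    ("put_down", "toy_car"),
]
print(compose(pack_gift_bag))
```

The new task requires no new training data: the composite behavior exists as soon as the plan over known primitives does.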
That is the significance of VLA: it is not only an architecture but a path toward "embodied intelligence".
Robots are no longer mere mechanical arms but "learning assistants" that gradually accumulate experience and learn to adapt.
From household chores to industry: Expansion and economic impact
Household chores are just the starting point. Next are scenarios such as warehouses, factories, and data centers.
Levine mentioned a logic in a podcast:
If you can make a good cup of coffee, you can move towards opening a coffee shop.
This is not just a metaphor but his roadmap for expanding capability: first perform one real-world task to people's satisfaction; then the steps grow more numerous and complex, and the deployments grow larger and larger.
The economic path is just as clear. Robots will first "partner with humans", taking over repetitive physical work and routine operations so that humans can focus on judgment calls and creative tasks.
In the past 30 years, the cost of robots has decreased by more than 50%.
McKinsey's report "Automation and the Talent Challenge in US Manufacturing" points out that routine, repetitive activities are the most automatable, and that once these links are automated, efficiency and yield often rise significantly.
Multiple industries are being transformed as robots enter manufacturing, warehousing, and assembly.
The hardware cost is decreasing, and the algorithms are becoming more and more accurate.
In the past, a research-grade robot could cost an enormous amount. But as hardware goes into mass production, materials and components become standardized, and vision-language-action models are layered on top, the cost of a "usable" robot keeps falling.
The lower barrier of household scenarios also lets more startup teams and small and medium-sized enterprises participate in deployment, creating economies of scale.
When these factors are combined, the economic impact will be significant.
On one hand, it will cut costs and unlock productivity for companies; on the other, it will reshape the labor market, value chains, and even social structure.
Roles that once required heavy manual labor, such as warehousing, packaging, and equipment inspection, are likely to be the first scenarios widely taken over by robots.
When robots truly enter households, factories, and construction sites, what we face is not only an increase in efficiency but also a profound adjustment of social structure.
In the short term, the partnership mode between humans and machines will bring huge dividends; in the long term, full automation may reshape the patterns of labor, education, and wealth distribution.
As Sergey Levine said,
What really matters is not the end point of a certain year, but when the flywheel starts to turn.
Once it starts, the speed will far exceed our intuition.
The next five years may be the window period that determines the pattern for the next few decades.
Reference materials:
https://www.dwarkesh.com/p/sergey-levine
This article is from the WeChat official account "New Intelligence Yuan" (author: Qingqing) and is published by 36Kr with authorization.