XPeng is accused of using "a human pretending to be a robot." Which is harder: "a human pretending to be a robot" or "a robot pretending to be a human"?
Just as Musk's robots were initially mocked as "human cosplay," He Xiaopeng's robots have faced similar scrutiny in China in recent days. XPeng had to "dissect" one on the spot to clear its name.
The newly displayed robot by XPeng | XPeng official website
Even today, when the first household robot has started pre-sales, people still have serious doubts about humanoid robots.
After paying a $200 deposit for the so-called world's first household robot (NEO), Linlin (a pseudonym) found on Xiaohongshu that NEO actually relies on manual remote operation. NEO can do some basic household chores now, but when it encounters a situation it can't handle, it has to call in staff at headquarters, who check the home through the camera and then operate the robot manually.
After realizing that she might be spending $20,000 on a "puppet robot," Linlin took it in stride: there is a user data privacy agreement anyway, and there are cameras everywhere these days.
"Technological progress always needs user support. I'll just support it."
Linlin is right. When new technologies are put into use, they always require a great deal of tolerance.
Take the market's current tolerance for Tesla, for example.
A few days ago, Tesla postponed the mass-production plan for Optimus again. This is the third time production of the robot has been "delayed." However, Tesla's stock price was hardly affected, and its market value has now exceeded $1.5 trillion. This shows that humanoid robots enjoy the tolerance not only of users but also of investors.
But the reason Tesla can't mass-produce robots gives one pause. It is neither that the robot is "not smart enough" nor that it has "mobility issues." Instead, it's a more counterintuitive problem: the hands.
The human hand has 27 bones, numerous nerves, and a real-time feedback system. It can "know" how much force to use, from which angle to grasp, and how to make fine adjustments. A robot hand, by contrast, has to be built layer by layer from micro-motors, reducers, sensors, and algorithms. Strength, stability, and flexible control are all indispensable.
It's not difficult to make a "movable hand," but it's extremely difficult to make a "hand that doesn't make mistakes like a human hand."
This is not just Tesla's problem. Almost all humanoid robot companies are stuck with this hand problem. Then, a question that has been repeatedly raised but never truly answered emerges again: Since hotel robots, cleaning robots, and logistics robots have already achieved commercial success, why are we still so persistent in creating a robot with hands just like a human's?
What about Doraemon? Isn't he doing well?
The more human-like, the harder to mass-produce
This past Halloween, you could even play "trick or treat" with a robot.
If you've been lucky enough lately while walking the streets of New York, you may have seen Tesla's Optimus handing out candy to passers-by. It picks a piece of candy from a pile and gives it to a passer-by. Sometimes, when the candy drops on the ground, it bends down to pick it up and hands it over again.
Tesla Optimus | Tesla official website
Don't underestimate this seemingly meaningless action. It took Tesla nearly five years to make the robot stand on the street and distribute candies.
In 2021, at Tesla's AI Day, Musk first introduced the "humanoid robot" on the big screen. At that time, the robot didn't actually exist in the physical world. Musk arranged for an actor in a white robot costume to come on stage and dance. This arrangement was widely mocked at the time, but this press conference expressed Musk's initial vision for the robot concept: What Tesla wants to create is never a machine that can only repeat a mechanical action, but an intelligent agent that can understand how the world works.
One year later, the first real Optimus was unveiled.
It could walk, but its movements were a bit clumsy, like a giraffe that has just learned to stand, cautious and wobbly. Its significance, however, was huge. This robot was electrically driven and didn't rely on hydraulics, which meant it could eventually be mass-produced, its costs could be reduced, and it could safely enter human living spaces.
In 2023, Optimus started to become "smart." Tesla connected it to a visual recognition system and a neural network model that were the same as those used in autonomous driving. It could recognize objects on its own, distinguish items of different shapes, and perform basic operations such as "stabilize, pick up, and move."
Optimus also publicly demonstrated folding clothes, a highly complex action that requires flexible manipulation, force control, and real-time visual feedback, and one that was unimaginable for traditional industrial robots.
The situation seemed promising, and there were mass-production plans for Optimus as well. However, the robot started to face "production difficulties."
Optimus' mass-production plan has been postponed three times:
The first time was in 2023. Optimus was originally planned to have its "first batch of shipments" in 2024, but because the basic motion control algorithm was not up to standard, this was postponed by one year.
The second time was at the end of 2024. The mass-production target was lowered to trial production of several thousand units in 2025.
The third time is now: Tesla was reported to have suspended mass production of Optimus again.
At the third-quarter earnings conference, Musk didn't shy away from the difficulty of mass-producing robots: We don't have a ready-made supply chain.
Compare building a car with building a robot. Walk into a car factory and you'll see a highly coordinated division of labor: motors, sensors, lights, wiring harnesses, seats... Every part has mature suppliers, standardized interfaces, replacement options, and cost curves.
However, humanoid robots don't have such a mature production line. Cars have standards for the whole vehicle, parts, and maintenance and replacement, while almost every humanoid robot looks different, with different joint layouts, sensor positions, and motion models. This means there are no standardized interfaces, no common parts, costs can't be reduced, and manufacturing can't be scaled up. In other words, to make one million humanoid robots, you have to build an industrial chain capable of producing one million robots first.
While the supply chain is still immature, there is also news of changes in the management of Tesla's robot project.
Some time ago, Milan Kovac, the head of Tesla's robot project, left the company. This core figure, who came from Boston Dynamics and led Optimus' system architecture, was regarded within Tesla as "the person who understands robot motion control best." His departure triggered a team restructuring. According to multiple foreign media reports, Musk then took over the project's direction personally, and the R&D reporting line was transferred from the autonomous driving department to the AI chip team.
The "hand" is the real problem
The most difficult part to make on a robot is the "hand."
"Put it this way: a robot's dexterous hand is even harder to mass-produce than the whole robot," said Lin Wu (a pseudonym), a PhD specializing in robotics research at Peking University, describing how difficult a dexterous hand is to manufacture.
To understand why the hand is so difficult, we first need to see how amazing the human hand is.
Our hand is composed of 27 bones, dozens of muscle groups, and numerous nerve endings working together. The dense tactile receptors on the palm allow our fingers to pick up a grain of rice gently and also hold a suitcase, to sense slight changes in the temperature of a glass, and to judge from experience whether a grape is ripe.
This kind of fine control is a continuous, real-time, biological-level feedback loop: the force comes from the forearm muscles, the finger joints are controlled in a coordinated manner, the skin's sense of touch provides feedback, and the brain makes fine adjustments within tens of milliseconds.
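The loop described above can be pictured, very roughly, as a controller that keeps nudging grip force toward what the sense of touch calls for. The following is a minimal Python sketch under stated assumptions, not a description of how the human nervous system or any particular robot actually works: the names `grip_step` and `grip_loop`, the gain of 0.5, and the 2.0 N target are all hypothetical and chosen only for illustration.

```python
def grip_step(current_force, target_force, gain=0.5):
    """One feedback step: close part of the gap between applied and target force.

    All values are hypothetical; forces are in newtons for illustration only.
    """
    error = target_force - current_force  # what "touch" reports as missing or excess
    return current_force + gain * error   # proportional correction


def grip_loop(target_force, steps=20, gain=0.5):
    """Repeat the feedback step and record the applied force after each step."""
    force = 0.0
    history = []
    for _ in range(steps):
        force = grip_step(force, target_force, gain)
        history.append(force)
    return history


if __name__ == "__main__":
    # Hypothetical target: 2.0 N, enough to hold a light object without crushing it.
    for i, f in enumerate(grip_loop(2.0, steps=5), start=1):
        print(f"step {i}: {f:.3f} N")
```

With these assumed values the applied force rises toward the target without overshooting it. A real hand or controller would also have to cope with sensor noise, delay, and slipping objects, none of which this sketch models.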
Until someone comes up with a better idea, robots replicate the human hand by replacing each of its key elements with a machine part.
The human finger tendons are replaced by micro-motors, the joints are driven by reducers and transmission lines, the bones become lightweight brackets, the tactile receptors become an array of force/pressure/temperature sensors, and the action prediction and feedback control performed by the brain are handled by a real-time motion model and an AI decision-making system.
This is just the first step. The real problem with the dexterous hand lies in strength, accuracy, and durability.
The first constraint is that the physical space of the hand is too small.
This raises the precision-manufacturing requirements for the motors, sensors, and reducers installed in each joint. A smaller physical space means a smaller motor, a smaller motor means lower power output, and that in turn means the dexterous hand doesn't have enough gripping force.
"Currently, a robot weighing sixty to seventy kilograms can only pick up a heavy object of about ten kilograms," the researcher revealed. "This is far from our expectations for robots."
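As a rough check on those figures, the payload can be expressed as a fraction of the robot's own weight. The short Python sketch below uses only the numbers quoted above; the function name `payload_ratio` is a hypothetical helper introduced for illustration.

```python
def payload_ratio(payload_kg, robot_mass_kg):
    """Return the payload as a fraction of the robot's own mass."""
    return payload_kg / robot_mass_kg


if __name__ == "__main__":
    # Figures quoted in the article: a 60 to 70 kg robot lifting about 10 kg.
    for mass in (60, 70):
        print(f"{mass} kg robot: {payload_ratio(10, mass):.0%} of its own weight")
```

On these numbers the robot lifts roughly 14% to 17% of its own weight.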
There are two mainstream ways to solve the gripping-force problem. One is to make "bionic muscles." By using electric muscle fibers, pneumatic artificial muscles, or hydraulic micro-tubes to simulate human tendons, the fingers can bend flexibly like a human hand. This method looks good in the laboratory, but it struggles with force amplification and long-term durability, and it is still far from mass production.
The other way is to "outsource" the force to the forearm, just like humans. Tesla, Boston Dynamics, and Figure all follow this route. They place the driving motors in the forearm, and the fingers control the joints through thin transmission lines. This way, the robot can have enough overall strength and a controllable structure, but the structure is complex and the maintenance cost is high.
"Accuracy" is another problem.
For humans, picking up a cup in front of them is an unconscious action that doesn't require any thinking. But in fact, behind this is a highly complex biological cooperation system.
The eyes first recognize the object, judge its shape, size, and material; the brain quickly estimates the distance between the cup and the hand, the trajectory of the arm stretch, and decides in an instant "how many fingers to use, with how much force, and from which direction to grip it"; when the fingers touch the cup, the pressure sensors on the skin will tell us in real time "the force is not enough" or "you're pressing too hard and might crush it," so the hand will naturally make fine adjustments. This whole process seems casual, but in fact, it involves the synchronous work of the visual system, motor cortex, cerebellum, somatosensory system, and muscle system, and each step is so fast that we don't even have time to be aware of it.
However, robots don't have "unconscious experience."
In the robot world, picking up the same cup has to be divided into five steps: First, it has to "see" what the cup is, which requires cameras and depth sensors for object recognition.
Then, it has to calculate the position of the cup in three-dimensional space, determine where and how