Robots can't yet replace human workers.
Humans are being "won over" by humanoid robots.
At the recently concluded 2025 World Robot Conference (WRC), these "steel warriors" were the star attraction. Throughout the five-day exhibition, the crowds never thinned and every booth was tightly surrounded. Amid exclamations of "Wow", visitors snapped photo after photo on their phones, and short videos of robots flooded social platforms.
More and more people are amazed at how fast humanoid robots are evolving. They are no longer clumsy lumps of iron. They now have dexterous hands and feet, and skin so realistic that it can "fool the eye". They can even raise their eyebrows, smile, and give a flirtatious look.
Their skills are also improving across the board:
- Good at performing: They can dance, walk the catwalk, box, play football, and more;
- Capable of working: They can stand in for human workers in tasks such as tidying up at home, making coffee, and industrial handling;
- Able to communicate: They can understand human speech, hold simple natural conversations, and are gradually shedding the label of "artificial stupidity".
But they also have quite a few bugs:
- Repetitive movements: The dances, backflips, and even the falling postures of robots from many manufacturers look copied and pasted, prompting netizens to joke about "lazy programmers";
- Low efficiency: They fold clothes as slowly as sloths, and in industrial scenarios they are still stuck at basic sorting;
- High price: A top-of-the-line robot can cost as much as a BMW.
Despite this, they are still staggering out of the laboratory and accelerating their way into the real world. Various humanoid robot competitions such as "marathons", "sports meets", and "boxing matches" are still dominating the media at home and abroad.
Recently, "Focus One" spoke with several leading humanoid robot companies and senior practitioners. It is still too early to talk about large-scale applications, and "intelligence" and "cost" remain bottlenecks. However, technological progress, capital investment, and market demand are accelerating the process. In the future, humanoid robots may upend how we think about labor, efficiency, and intelligence.
What kind of evolution have humanoid robots undergone?
What exactly are humanoid robots? They are not as simple as people imagine. Let's look at them from three angles: appearance, interaction methods, and application scenarios.
Let's first take a look at what today's humanoid robots look like.
Generally speaking, they have a human-like structure consisting of a torso, a head and neck, and limbs. In practice, the forms of humanoid robots vary greatly from one manufacturer to another, and the differences show up mainly in the design of the hands and feet. Hands fall into three categories: dexterous hands (a bionic five-finger design that can imitate the delicate movements of human hands), two-finger grippers, and three-finger hands. Lower bodies are either bipedal or non-bipedal.
In the view of practitioners, although dexterous hands and bipedal designs are closer to the human form, their functions are still relatively basic and their prices are high. One practitioner revealed that a high-end dexterous hand can cost as much as 100,000 to 200,000 yuan.
Kris, a senior practitioner in embodied intelligence with many years of experience in the Internet and autonomous driving industries, told "Focus One" that a pair of dexterous hands can account for one-third of a robot's total cost. Upgrading from a two-finger gripper to a three-finger hand adds only one finger but may multiply the cost several times, and the gain in performance may not be proportional to the extra cost.
Therefore, to deliver better performance while keeping costs in check, most humanoid robot companies adopt less human-like grippers and wheeled bases unless customers have specific requirements.
For example, the Astribot S1 from exhibitor Stardust Intelligence demonstrated complex tasks such as making breakfast, brewing coffee, and painting fans, all with a two-finger gripper.
An Zhaohui, the R&D leader of Stardust Intelligence, told "Focus One" that the manipulation functions of a humanoid robot are concentrated in the upper body, and that it is the entire upper body, not just the gripper, that matters. The company therefore adopted an innovative rope-driven transmission design for the key parts of the robot body, one that closely imitates human muscles and the way they apply force. The result, he said, is more human-like, more dynamic, and safer.
Now let's look at the interaction methods. Kris explained that there are mainly three methods to control the operation of humanoid robots: teleoperation (capturing human movements through devices such as sensors and controllers), isomorphic arms (transmitting movements to the robot arm through joint mapping), and voice control. For more complex instructions such as making breakfast, teleoperation and isomorphic arms are used. For simple instructions such as pushing, picking up, and putting down, voice control can be used, and in some cases, the robot can even operate autonomously.
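As a rough illustration of the joint mapping behind an isomorphic arm, the minimal Python sketch below copies each joint angle read from a handheld "leader" arm to the matching joint of the robot's "follower" arm and clamps it to a safe range. The names `JointLimit` and `map_leader_to_follower`, the one-to-one joint correspondence, and the limit values are all assumptions made for illustration; they do not come from any manufacturer interviewed here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class JointLimit:
    """Allowed angle range for one follower joint, in radians (hypothetical values)."""
    lower: float
    upper: float


def map_leader_to_follower(
    leader_angles: Sequence[float],
    limits: Sequence[JointLimit],
) -> List[float]:
    """Map leader joint angles to follower joints one to one.

    Assumes both arms have the same number of joints in the same order,
    which is what "isomorphic" is taken to mean in this sketch.
    """
    if len(leader_angles) != len(limits):
        raise ValueError("leader and follower must have the same joint count")
    # Clamp each angle so the follower never leaves its allowed range.
    return [
        min(max(angle, limit.lower), limit.upper)
        for angle, limit in zip(leader_angles, limits)
    ]


if __name__ == "__main__":
    limits = [JointLimit(-1.0, 1.0)] * 3
    print(map_leader_to_follower([0.5, -2.0, 3.0], limits))
```

A real system would also have to deal with timing, filtering, and differences in arm geometry, all of which this sketch leaves out.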
But whichever method is used, it is still far from true autonomous control by AI. Even the seemingly intelligent "voice control" is mostly based on preset rules. The robot only appears to have self-awareness; it lacks any real ability to adapt to new scenes.
It should be noted that teleoperating a humanoid robot is not the same as driving a remote-controlled car; it also demands a fair amount of technology.
Gashero, a visiting engineer at the School of Computer Science of Peking University with extensive practical experience in the Internet, autonomous driving, and robotics industries, explained to "Focus One" that although it looks as if someone is operating the humanoid robot with a remote control, teleoperation actually sends instructions rather than directly controlling the lowest level of the robot. The robot still has to plan and execute a large number of subtasks on its own. For example, it must keep its own balance and coordinate the many motors and sensors on its body to carry out the target action, which is technically challenging.
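The difference between sending an instruction and controlling the lowest level can be sketched in a few lines of Python. In this simplified, hypothetical example the operator supplies only a target position for the hand, and an onboard planner (the made-up function `plan_waypoints`) breaks the move into small steps on its own. Real robots do far more, including balance control and per-motor coordination; the step size and coordinates here are assumed values.

```python
import math
from typing import List, Sequence, Tuple


def plan_waypoints(
    current: Sequence[float],
    target: Sequence[float],
    max_step: float,
) -> List[Tuple[float, ...]]:
    """Split a straight-line move into waypoints no farther apart than max_step.

    The operator's command is just `target`; the intermediate points are
    planned onboard, which stands in for the robot's own subtask planning.
    """
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    distance = math.dist(current, target)
    if distance == 0:
        return [tuple(float(t) for t in target)]
    steps = math.ceil(distance / max_step)
    # Linear interpolation from the current position to the target.
    return [
        tuple(c + (t - c) * i / steps for c, t in zip(current, target))
        for i in range(1, steps + 1)
    ]


if __name__ == "__main__":
    # One high-level command: move the hand 1 m along x, at most 0.25 m per step.
    for point in plan_waypoints((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.25):
        print(point)
```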
Finally, let's look at the application scenarios.
According to practitioners, humanoid robots divide clearly into To B (enterprise-level) and To C (consumer-level) directions. To B mainly covers four major fields: cultural and entertainment performances, industrial manufacturing, cultural and tourism services, and medical and health care. To C is concentrated in household scenarios. Kris summarized the goal of humanoid robots as replacing the traditional "three bao" jobs (security guards, cleaners, and nannies, whose Chinese job titles all begin with the character "bao").
Kris said that cultural and entertainment performances are currently the most mature application scenario, with dances, catwalk shows, and competitive events appearing frequently. Other scenarios are still at a basic stage. In industrial manufacturing, for example, the work is mainly sorting and handling on the assembly line, and in cultural and tourism services it is mainly guiding visitors at scenic spots.
Image source / Kris' Robot Awakening Notes, provided by the interviewee
However, from a practical value perspective, Gashero believes that at present, many humanoid robots don't have a strong presence after "going to work".
For example, in the task of moving boxes in a warehouse, warehouse AGV robots (a combination of machine vision and robotic arms) are already very mature and inexpensive, and humanoid robots don't have strong competitiveness. As for cultural and entertainment performances, he believes it's not sustainable. "After the novelty wears off, robots still need to pursue creating real value."
In summary, although humanoid robots have made great progress in recent years, there are still several key thresholds to cross before they can truly exert their value.
They need to pass the "cost barrier" and the "intelligence barrier"
Many practitioners summarized that the main problems of humanoid robots at present are "not smart enough" and "not affordable enough".
You can think of a humanoid robot as a "person" made up of a "body" and a "brain". The hardware is its "body", which practitioners call the robot body (a term sometimes rendered literally as "ontology"), and the software is its "brain", which controls its thoughts and actions.
It's a consensus in the industry that the motion performance of domestic humanoid robots has become increasingly mature and can meet basic operation needs. However, compared with the "strong body", the "brain" of humanoid robots has big problems, and the intelligent development of the industry is seriously uneven at present. Kris said bluntly that the software of humanoid robots is still at the Demo level, like a child who has just learned to walk and can only walk within a specific small area.
Large language models keep getting smarter because they keep learning from vast amounts of data. The same holds for humanoid robots, except that they have to carry out a large number of interactions in the real physical environment to obtain the data needed to train their decision-making and action abilities. In reality, operation data from the physical world is very scarce, which seriously limits the development of humanoid robots.
Kris said that the software of humanoid robots is basically built on a VLA (vision-language-action) architecture. Under this architecture, for the "brain" to recognize objects and command the "body" to complete actions, it must rely on accurate, real spatial data.
For example, if you ask a humanoid robot to hang out the clothes, it has to know where to go to hang them and what the specific coordinates of that place are. But in real life, this part of the data is exactly what's missing. So many humanoid robots have to be fixed in a certain place when completing specified actions, and the things they pick up must be within their line of sight, as if they were tied by an invisible rope.
However, the intelligence of some humanoid robots has evolved.
For example, in the household scenario of "tidying up the desktop", the Astribot S1 of Stardust Intelligence (relying on the company's full-body VLA model) can autonomously tidy up sundries even when facing many unseen objects or unexpected interference. Even when the scenario was moved to the WRC site, the model could still be used after only a small amount of supplementary data was added.
Behind this is a closed loop of self-developed models, the robot body (the "ontology"), and a large amount of accumulated data. Its "meta-skill library" learning method lets the robot continuously collect interaction information across scenarios and transfer skills to new tasks without starting from scratch, much as a child learns about the world by drawing inferences from one example.
But An Zhaohui also told "Focus One" that the generalization ability of humanoid robots is still a headache for the entire industry. At present, a robot can only generalize across similar scenarios and cannot answer questions from every industry the way ChatGPT does. In short, it is still a vertical specialist rather than a generalist.
More than one practitioner said that synthetic data is the key to promoting the rapid implementation of embodied intelligence. Companies represented by Galaxy Universal are focusing on research in the field of embodied intelligence and have reached the forefront of the industry in terms of the "brain".
Take Galbot of Galaxy Universal as an example. In an offline store in Zhongguancun, Haidian, Beijing, it can autonomously complete the entire process of greeting customers, taking orders and payment, picking products, handing them over on the spot, and using multi-voice interaction to attract customers, all without human teleoperation. Faced with more than 300 kinds of refrigerated and hot drinks in different forms, it can accurately grasp the right product without knocking over other goods.
Zhao Yuli, the Chief Strategy Officer of Beijing Galaxy Universal Robot, revealed to "Focus One" that this relies on GroceryVLA, which the company developed in-house and describes as the world's first end-to-end embodied intelligence large model for retail. Built on large-scale synthetic data and a Sim2Real (simulation-to-real transfer) approach, GroceryVLA does not need parameters tuned for each product separately. It applies a unified grasping strategy across categories and objects and has strong autonomous decision-making and anti-interference abilities.
At the WRC, Rev Lebaredian, NVIDIA's vice president of Omniverse and simulation technology, appeared on stage with Wang Xingxing of Unitree Technology and Wang He, the founder of Galaxy Universal. NVIDIA announced that it had supplied the first batch of Jetson Thor chips in China to Galaxy Universal. Galaxy Universal's WRC booth displayed the Galbot G1 Premium, billed as the world's first robot equipped with NVIDIA's Jetson Thor chip, which on-site visitors called "the most dexterous humanoid robot for work".
If the "intelligence barrier" limits the capabilities of humanoid robots, then the "cost barrier" determines whether they can be widely popularized.
At the WRC, the prices of humanoid robots varied greatly, but overall they were quite expensive, with most in the range of hundreds of thousands of yuan. The most expensive cost nearly one million yuan, which netizens ridiculed as "exclusive for the rich". There were also a few "affordable models", such as Unitree Technology's G1 humanoid robot, priced at 99,000 yuan, but even that is not cheap for ordinary families.
Such prices are off-putting, but practitioners explained that this is not price gouging. The cost of manufacturing a humanoid robot really is very high. Many core components in the robot field cost about the same as their counterparts in the automotive industry, and even the suppliers are the same. High hardware costs make it difficult for robots to enter ordinary households in the short term the way other household appliances have.
The listing wave has arrived ahead of commercialization
Even though humanoid robots are facing double tests of intelligence and cost, the market is still very optimistic about the prospects of this industry.
From a macro data perspective, the industry scale is growing at a high speed. At the opening ceremony of the 2025 World Robot Conference, a set of data was announced: in the first half of this year, the revenue of China's robot industry increased by 27.8% year-on-year. The production of industrial robots and service robots also increased significantly, with year-on-year growth of 35.6% and 25.5% respectively. China has been the world's largest application market for industrial robots for 12 consecutive years.
The enthusiasm at the enterprise and capital levels is even stronger.
According to Qichacha data, 152,800 robot-related enterprises were registered in the first seven months of this year, a year-on-year increase of 43.81%, a growth rate far above last year's full-year level. As of August 12, China had 958,000 existing robot-related enterprises.
Meanwhile, many enterprises have set their sights on the capital market. Public information shows that more than 20 humanoid robot companies around the world have either launched IPOs or are rumored to be planning them. Of these, 16 are from China, including Unitree Technology, Zhipu Robotics, and Fourier Intelligence.
Each company focuses on different application scenarios and has different advantages.
For example, Unitree Technology's advantages lie in core technology and commercialization. More than 95% of its core hardware is developed in-house. On the commercial side, the basic version of its G1 starts at 99,000 yuan, which is highly cost-effective, and its applications are already relatively mature.