
Silicon-based beauties with model-like faces are going viral. After just 3 seconds of eye contact, I can no longer see them as robots.

爱范儿 (ifanr), 2026-03-26 11:31

"If humanoid robots have a final form, it must have a head and a face. What do you think?"

On March 22, a video lasting 1 minute and 18 seconds caused a stir on social media. In it, a bionic humanoid robot with long black hair and a face so exquisitely beautiful that it was almost unsettling slowly turned its head.

The person who posted this video is Hu Yuhang, the founder of Shouxing Technology. He wrote on the X platform: Bionic Humanoid Robot: Origin F1 — New Skins, New Souls.

RoboHorizon magazine commented that the facial expressions of Origin F1 are "more convincing than some politicians."

This is not the first time Shouxing Technology has gained popularity. In May last year, a video of Hu Yuhang making eye contact with a robot sparked extensive discussions. But Origin F1 seems to have taken another step forward.

The human face is the oldest UI

In 1984, Apple released the Macintosh, replacing the command line with a graphical interface. In the following forty years, every revolution in computing devices has essentially been an interface revolution: the mouse, touch screen, voice, and gestures. Although the improvement of hardware performance is important, what truly changes the relationship between humans and machines has always been the change in the interaction method.

Psychology has a "55/38/7 rule," usually attributed to Albert Mehrabian: 55% of emotional information comes from facial expressions, 38% from tone of voice, and only 7% from the words themselves. In other words, more than half of the emotional information is transmitted through the face. To establish an emotional connection between humans and robots, a face that can express emotions is almost a necessity.

Yet when it comes to the face, almost all humanoid robots have left it blank.

This is also the reason why Hu Yuhang regards the human face as a "platform." Just as iOS is not an app on the iPhone but the basic layer on which all apps run, the human face is the basic layer of the human social protocol. Eye contact builds trust, a smile conveys kindness, and a frown expresses doubt.

In this sense, the human face is the oldest operating system of humans. What Shouxing Technology wants to do is to transplant this operating system onto robots.

Next, let's see how Hu Yuhang did it through several papers.

Hu Yuhang earned his doctorate in the Department of Mechanical Engineering at Columbia University, studying under Professor Hod Lipson. The Lipson laboratory is a global pioneer in robot self-modeling. Since 2006, it has been exploring how to let robots learn movement by observing themselves.

In March 2024, Hu Yuhang, as the first author, published a paper on facial "co-expression" in Science Robotics.

The core idea of this paper is very ambitious: Robots should not just imitate human expressions but be able to predict the expressions humans are about to make and execute them synchronously.

The team designed a robot head named Emo, equipped with 26 actuators and covered with flexible silicone skin, with a high-resolution camera embedded in each pupil to achieve eye contact.

The training process is divided into two steps: first, let the robot make a large number of random expressions in front of a mirror and establish a facial self-model through self-supervised learning; then let it watch videos of human faces and learn to predict the expression changes of the interlocutor.

After these two steps, the robot can smile synchronously when a human smiles, rather than imitating with a delay.

Delayed imitation seems fake, while synchronous expression makes people feel on the same page.
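The two training steps can be illustrated with a toy sketch. It is not the paper's implementation: `simulated_face` is a hypothetical stand-in for the real hardware and camera, the self-model is a simple nearest-neighbour lookup over babbled samples, and the prediction step uses linear extrapolation in place of the predictor the team learned from video. All function names and numbers are assumptions made for illustration.

```python
import random

def simulated_face(cmd):
    """Hypothetical stand-in for hardware plus camera: maps two motor
    commands in [0, 1] to two observed landmark coordinates."""
    a, b = cmd
    return (0.8 * a + 0.1 * b, 0.2 * a * b + 0.6 * b)

def babble(n_samples, rng):
    """Step 1: issue random commands "in front of the mirror" and record
    what the face looked like for each one."""
    memory = []
    for _ in range(n_samples):
        cmd = (rng.random(), rng.random())
        memory.append((cmd, simulated_face(cmd)))
    return memory

def inverse_self_model(memory, target):
    """Return the remembered command whose observed landmarks were
    closest to the target landmarks (nearest-neighbour lookup)."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))
    best_cmd, _ = min(memory, key=lambda pair: sq_dist(pair[1], target))
    return best_cmd

def predict_next(frames, horizon=1):
    """Step 2 stand-in: linearly extrapolate the human's landmarks a few
    frames ahead from the last two observed frames."""
    prev, last = frames[-2], frames[-1]
    return tuple(l + horizon * (l - p) for p, l in zip(prev, last))

rng = random.Random(0)
memory = babble(2000, rng)

# A human smile that is still forming: landmarks drift frame by frame.
human_frames = [(0.30, 0.20), (0.35, 0.25), (0.40, 0.30)]
anticipated = predict_next(human_frames, horizon=2)  # roughly (0.50, 0.40)

# Act on the anticipated expression rather than the current one.
command = inverse_self_model(memory, anticipated)
achieved = simulated_face(command)
error = max(abs(a - t) for a, t in zip(achieved, anticipated))
```

The point of the sketch is the ordering: the robot picks its motor command for where the human's face is heading, not where it is now, which is what lets the expressions land together instead of one trailing the other.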

The paper validated the approach with data from more than 45 human participants.

Hu Yuhang repeatedly mentioned the concept of "self-modeling" in multiple interviews. He was not satisfied with the mainstream reinforcement learning path at that time because of its weak generalization ability. He also gave an example:

If you teach a robot to play table tennis and then teach it to play badminton, it will forget the first skill after learning the second. If it learns both at the same time, its performance on each gets averaged out.

What he wants is not to let the robot focus on a single task but to let it learn a "learning ability."

In January this year, a more advanced result appeared on the cover of Science Robotics.

This time, the focus is on lip movement. In face-to-face human communication, nearly half of the visual attention is concentrated on the lips. However, even the most advanced humanoid robots still only have simple opening and closing movements of the mouth.

Hu Yuhang's team designed a 10-degree-of-freedom lip drive mechanism, combined with a flexible silicone lip, which can cover the lip shapes corresponding to 24 consonants and 16 vowels.

At the algorithm level, they adopted a self-supervised learning pipeline based on a variational autoencoder (VAE), combined with a facial action Transformer, allowing the robot to infer the lip movement trajectory directly from speech audio without any hand-written phoneme-to-lip-shape mapping rules.

Finally, lip-sync across 11 languages was achieved, including speaking and singing.
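The article does not describe the model's internals. As background only, here is a minimal sketch of the standard VAE training objective: a reconstruction term plus a KL term that pulls the latent code toward a standard normal. The diagonal Gaussian posterior, the squared-error reconstruction, the `beta` weight, and names such as `lip_pose` are assumptions for illustration, not details from the paper.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def reconstruction_error(x, x_hat):
    """Squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL regulariser."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)

# Hypothetical 10-value lip pose, one value per degree of freedom.
lip_pose = [0.1 * i for i in range(10)]
mu, log_var = [0.5, -0.5], [0.0, 0.0]
z = reparameterize(mu, log_var, random.Random(0))
loss_if_perfect_reconstruction = vae_loss(lip_pose, lip_pose, mu, log_var)
```

In a full pipeline an encoder would produce `mu` and `log_var` from the input and a decoder would produce the reconstruction from `z`; both are omitted here to keep the objective itself in view.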

Professor Lipson said at the time: There will be no future in which humanoid robots lack faces. Once they have faces, their eyes and lips must move correctly; otherwise, they will always stay in the uncanny valley.

Humanoid robots are more suitable for providing emotional value

Take a look at the current humanoid robot field: dexterous hands driving screws, robots dancing, robots riding bikes. Almost all leading companies are aiming to replace blue-collar workers with humanoid robots in the near future and to move into manufacturing and logistics.

However, industrial automation robots are naturally optimized for specific tasks. Three motors can complete one action with extremely high efficiency and strong stability. A humanoid robot needs two or three dozen motors to work together to do the same thing. It is more expensive, less stable, has a shorter battery life, and may even fall.

Falling is dangerous.

A machine that costs hundreds of thousands of yuan and looks like a human, doing the work that a 30,000-yuan robotic arm can do in a factory, cannot be called a technological revolution. It is closer to performance art.

The entire industry has invested a lot of resources in the word "humanoid" but has collectively failed to address the word "human."

A humanoid robot without a face can complete tasks but cannot establish a relationship. And a relationship is the ticket to the consumer market.

Shouxing Technology has chosen a completely different path. Instead of making the robot compete head-on with industrial robotic arms in terms of productivity, it is better to let it do what industrial robotic arms can never do - establish an emotional connection.

Hu Yuhang's core judgment is: In the next five years, the biggest commercial opportunity for humanoid robots lies not in productivity but in emotional value.

Humans naturally project emotions onto things that look like humans. When you see a robot fall, you will feel sorry for it; when you see robots crowded together while playing football, you will find it interesting. This projection is instinctive and does not require the robot to be truly conscious or have feelings. And the human face magnifies this projection to the extreme.

In June 2024, he founded Shouxing Technology in Shanghai with a team of fewer than ten people. Four months later, it received an angel round of financing, with investors including Miracle Plus, Zhipu Robotics, and Dexun Investment.

Since then, the financing rhythm has been astonishingly fast. In 2025, it completed four rounds of financing, from the Pre-A round led by China Merchants Group Venture Capital and Shenzhen Capital Group, to the A round led by Shunwei Capital, and then to the two rounds led by Ant Group...

In terms of product lines, Shouxing currently has several series.

The Elf series is a full-body bionic humanoid with 30 facial degrees of freedom, driven by brushless micro-motors to move the silicone skin.

The Origin series is more for research and display purposes. The Origin M1 is a half-body version, equipped with lip-sync and head-eye coordination capabilities. The newly launched Origin F1 is the culmination of their technology, equipped with the so-called Omni Model, achieving a deep integration of real-time facial micro-expressions and voice.

In addition, there is a more affordable Lan series, which is positioned for scenarios that require more mobility.

In December last year, Shouxing Technology and the mobile game "Back to the Past" jointly launched a bionic robot of the game character "Fang Chengyi" at the CP32pre Comic Con in Hangzhou. According to reports, a binocular vision system lets the robot make eye contact with the audience in front of it, and its onboard AI bionic motion algorithm produces natural facial expressions and head movements.

Earlier, Shouxing Technology had cooperated with "Back to the Past" to launch the robot "Sprite Xuan." Sprite Xuan later appeared at the Douyin Spring Festival Gala with a new skin and performed the original love song "Undefined Relationship."

In terms of implementation, Hu Yuhang has mentioned several directions. In the short term, there are a large number of emotionally draining jobs in life: salespeople, receptionists, and service staff. These positions essentially involve continuous depletion of human emotions. They have to keep smiling every day and patiently solve repetitive problems. He believes that in two to three years, humanoid robots can replace some of these positions.

The ultimate goal in the long run is the consumer market. He wants everyone to have a bionic robot that can provide emotional companionship. Hu Yuhang does not avoid the controversy of this goal. When an AI is always pleasing you, without conflicts or selfish motives, will it trap people in a false relationship?

He said that they will add parameters to maintain authenticity when training the robot, so that it has certain conflicts and self-expression and is not just an emotional massager. At the same time, guiding functions can be injected into the program, such as reminding you to visit your parents during festivals and suggesting that you go hiking with friends on weekends. The robot should not be possessive.

These ideas are still in the early stage. But at least one thing is right: the market for emotional needs is larger than most people think. Action figures, blind boxes, plush toys, and pets are all carriers of emotional sustenance.

Pop Mart sells more than 10 billion yuan worth of IP derivatives in a year, proving that people are willing to pay for things that have no practical functions as long as they carry some emotional value.

If a robot can respond to your emotions in a human way, its potential is obviously much greater.

The person who took the TOEFL nine times

Hu Yuhang did poorly in the college entrance examination. After entering university, he worked hard to prove himself, ranking first in his major every semester and getting full marks in all major courses. But when he decided to study abroad, he scored only a little over 40 on his first TOEFL test, while the baseline was 100. He took the TOEFL nine times and the GRE three times in the window before the application deadline, and finally cleared the bar on his last attempt.

When he recalled this experience, he said: I think maybe my future self helped my past self.

His past academic experience also explains a very special quality in him. He said that the quality he values most is resilience. When recruiting, he prefers to look at whether a person's experience is full of setbacks.

He especially likes people who participate in competitions, especially those from the Rob