
CEO's Secret Tips · Project X | How much fun is it when robots learn to manage their facial expressions?

CEO's Secret Tips · 2025-08-14 15:28
Find something you think is cool and keep doing it. First, make yourself believe in it, and then make everyone else believe in it.

"CEO's Secret Tips · Project X" has launched as a dedicated showcase for tech products, where CEOs take the stage with their best offerings. No empty talk here: we discuss only polished, genuinely impressive tech products.

The era of "emotional expression" in AI interaction has arrived. What kind of robot is most like a human: one that talks eloquently, or one that understands your emotions? As AI evolves from text-based intelligence to embodied expression, and from hardware-centric to emotionally interactive products, how will it change companionship and the way we interact? On August 7 at 19:00, "CEO's Secret Tips · Project X" invited Cao Rongyun, founder and CEO of Lunwen Technology, and Guo Hao, a director at Yunxiu Capital, to show how much fun a robot with "expression management" can be.

The livestream focused on the following questions:

  1. How did the two of you first meet? What do you think of the potential of academic-based startup teams?
  2. How are the expressions of the Anni robot realized? Which AI products do you think are promising?
  3. How do you evaluate the product form direction, price range, and future competitive landscape of such products?
  4. What do you think of the technical difficulties and application scenarios of expression-head products? How did Dr. Cao start his entrepreneurial journey? Why did you choose the gray skin?
  5. In your opinion, what are the core advantages of the future winners in this field? How do you judge whether an AI project is genuine?
  6. Last year, some people in the industry said that humanoid robots would not be commercialized for at least 10 years. What do you think of the situation this year?
  7. How big do you think the future emotional companionship market will be? What suggestions do you have for friends who are starting businesses in the field of embodied intelligence?

The following is a conversation between the guests and 36Kr, with some content edited:

36Kr: How did the two of you first meet? What do you think of the potential of academic-based startup teams?

Guo Hao: We first saw Dr. Cao's product on the Keda video account. At that time it was just an early demo that imitated a few facial expressions. We found it very interesting and quickly contacted Dr. Cao, hoping to take part in and support the team's incubation as a company. Yunxiu led the seed round and helped the team grow from a small startup into a commercially capable enterprise. Pushing the project forward together has been a very meaningful process.

Cao Rongyun: I was introduced by a senior labmate and had a barbecue with Neo. There's nothing a barbecue can't solve. Our team met at the Robotics Laboratory of the School of Computer Science at the University of Science and Technology of China. Everyone is a jack-of-all-trades. Beyond hardware design and software algorithms, we do basically everything in the lab ourselves, from tightening screws to making the silicone molds for the robot's face. Because robot R&D is multidisciplinary, you need a good grasp of every aspect, big and small. I think everyone has a very solid foundation, and since we've worked together for a long time, we also have a strong basis of trust.

36Kr: How are the expressions of the Anni robot realized? Which AI products do you think are promising?

Cao Rongyun: It's achieved at three levels. The first is the hardware level: you have to build the robot before it can make expressions at all. Our expression head has the most motors, which provides the core support for complex expressions. The second is the task level: this part of the model decides which expression the robot should make in a given situation. The third is the execution level, which works out how to actually produce, say, a "happy" expression.
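The three levels described above can be pictured with a minimal sketch. It is only an illustration under stated assumptions: the motor names, pose values, and keyword rule are hypothetical and are not taken from Anni's actual design, and the task level here is a toy rule standing in for a learned model.

```python
from dataclasses import dataclass, field

# Hypothetical motor names and normalized targets (0.0 to 1.0).
# These are illustrative only and do not reflect Anni's real motor layout.
EXPRESSION_POSES = {
    "neutral": {"brow": 0.5, "eyelid": 0.5, "mouth_corner": 0.5},
    "happy": {"brow": 0.6, "eyelid": 0.4, "mouth_corner": 0.9},
    "surprised": {"brow": 1.0, "eyelid": 1.0, "mouth_corner": 0.5},
}


@dataclass
class ExpressionHead:
    """Hardware level: tracks the current position of each motor."""
    positions: dict = field(default_factory=lambda: dict(EXPRESSION_POSES["neutral"]))

    def move(self, targets: dict) -> None:
        for motor, value in targets.items():
            if motor not in self.positions:
                raise KeyError(f"unknown motor: {motor}")
            # Clamp to the motor's assumed mechanical range.
            self.positions[motor] = min(1.0, max(0.0, value))


def choose_expression(situation: str) -> str:
    """Task level: decide which expression fits the situation (toy keyword rule)."""
    text = situation.lower()
    if "joke" in text or "good news" in text:
        return "happy"
    if "loud noise" in text:
        return "surprised"
    return "neutral"


def perform(head: ExpressionHead, expression: str) -> dict:
    """Execution level: turn a named expression into concrete motor targets."""
    head.move(EXPRESSION_POSES[expression])
    return head.positions


head = ExpressionHead()
expression = choose_expression("The user tells a joke")
final_positions = perform(head, expression)
```

Keeping the levels separate means the decision about which expression to make can change without touching how that expression is rendered on the motors.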

Regarding AI products, I think improving productivity or creativity is very important, and a high-quality life experience is equally important. The satisfaction that comes from emotional value has practical significance too. Recently I was deeply impressed by an AI game, "Whispers from the Star" by Cai Haoyu. The intellectual effort behind it is the key. In the current wave of large models, most interactive products are inevitably limited to turn-based exchanges, which differ from the real-time, continuous, natural interaction between people. This technical problem is hard to solve, but the game cleverly works around it. It sets the story as an interplanetary conversation between the player and a character on an alien planet. By exploiting the "bug" of the finite speed of light in the physical world, it makes the delay and turn-based feel of AI interaction plausible, so the scenario compensates for the technical limitation. This design is very smart, and it shows that any good product requires a great deal of intellectual effort.

Guo Hao: AI products are not limited to humanoid robots. What matters is whether they truly save effort and solve practical problems. On the software side, AI-agent-assisted programming tools are already quite mature. A friend's software company has significantly improved its development efficiency and reduced costs by using them, which is clear progress. On the hardware side, Plaud.AI's recording product is very practical. Companion products are less practical, face weaker demand, and are relatively expensive. Still, some are promising. For example, desktop-robot products built on the Espressif ESP32 have a BOM cost of only one or two hundred yuan, and related kits have been launched, so the prospects are good. I therefore think cost reduction is the key for such products to reach the market and be accepted.

36Kr: How do you evaluate the product form direction, price range, and future competitive landscape of such products?

Cao Rongyun: First, given the "uncanny valley" problem of expression-head products, our consumer (ToC) products will lean toward a more cartoonish style. People's imagination of living things has been fully expressed in movies and animation. The cars in "Cars", or the toys, cats, and dogs in other works, all have vivid expressions that are very different from the real world. So we are exploring how to bring such characters from anime, movies, and games into the real world, so that people can see them come alive in reality.

Moreover, I think the market for interactive products will be diverse. Companion products are closely tied to content: like short-video platforms and games, they accompany you in daily life and share your time. The game industry is a good example. Large companies can develop AAA titles with their manpower and resources, while many creative studios and even individuals also produce excellent works. So I think the future market will be very interesting.

Guo Hao: First, several products attracted a lot of attention at this year's CES. One example is Mirumi, which can be hung on a bag, moves, and gives feedback. There are many similar products across different price points and a wide variety of categories. However, whether these companion products can achieve good, sustainable sales still has to be verified by the market.

Second, companion toys are still essentially toys. A reasonable pricing reference is half the price of a mainstream game console. For example, the Nintendo Switch 2 is priced at $449 and sells well, thanks to a fan base and loyalty that other products can hardly match. Half of that is about $220. So if companion toys are priced below $200 and are highly practical, they may find a better market. If the price is too high, consumers will find it harder to afford. Of course, many other factors matter. The Nintendo Switch 2 is popular because it is backed by Nintendo's IP. If companion toys are backed by high-quality IP, they can also command a price premium.

Finally, it's difficult for these hardware products to reach a high degree of market concentration. On the one hand, the technical barriers are spread across many separate points, and different product forms require completely different technology stacks. On the other hand, hardware is very different from software, and its network effect is relatively weak. So large companies will participate and produce at scale, while small startups keep trying new products, and a single hit product may grow into an excellent company. The market will therefore be one where many players compete. Internet giants, established companies with high-quality IP, and startups will all take part, and this competitive landscape is unlikely to change in the short term.

36Kr: What do you think of the technical difficulties and application scenarios of expression-head products? How did Dr. Cao start his entrepreneurial journey? Why did you choose the gray skin?

Guo Hao: First, from a technical perspective, 2023 was a good time to start. The emergence of large language models opened up many possibilities, which is very valuable for the interaction field. From the perspective of the capital market, however, it was not the best time. In 2020 and 2021 the primary market was very hot, with active financing and listings. Since then it has become harder, with more challenges on every front.

Second, there are currently two popular directions in artificial intelligence: AI agents on the one hand and embodied intelligence (robotics AI) on the other. One question investors always ask is: "What are the commercialization scenarios, and how will you implement them?" Starting a business in this direction essentially means looking for commercialization scenarios. Everyone has the "hammer" ready and is looking for the "nail" to hit. So the entrepreneurial process is also the process of answering this question. As a financial advisor (FA), we can only tell our investor friends which scenarios we envision, and Lunwen Technology is working to implement them.

Finally, some benchmark companies in the industry have already put application scenarios into practice. For example, the Ameca robot from the British company Engineered Arts gave a successful interactive performance with models at Milan Fashion Week last year, which shows that robots have application scenarios in the entertainment industry. In China, scenarios such as education, visitor guidance, and sales promotion are also expected to use robots.

Cao Rongyun: As for our team's experience, in the early years we built many practical robots in the lab that could grasp objects and complete tasks. In 2015 and 2016, however, we found it very hard to make these robots truly useful. At that time there were no large models to assist with task planning, and generalization was poor. So I turned my attention to interaction. This direction has lower costs and a higher tolerance for errors, though it has challenges of its own. In 2022 our core team decided to start a business and received support from the university's innovation and entrepreneurship fund. In 2023 we felt the time was right, and we established a company to enter market competition and face tougher tests. Going from a student in the ivory tower to an entrepreneur is full of challenges but also interesting. I hope to grow as quickly as possible through this process.

Second, in terms of the value of the work itself: in human-to-human interaction, non-verbal signals such as facial expressions, gestures, and interpersonal distance account for more than 50% and are often overlooked. Since robots are human-shaped, they should adopt more natural, human-like ways of interacting, and facial expressions are the most critical part of non-verbal interaction. That is why we started with the expression head.

The difficulties in expression interaction show up in the following areas. The first is hardware, which sets the upper limit of what expressions are possible. The human face has 42 complex muscles that drive expressions. The robot uses motors to imitate how those muscles work and to drive the elastic skin. Second, two models work together to decide when to show which expressions and emotions. One is a reflex-type model, similar to the instinctive expressions and postures that appear when a person is nervous. These reactions need no reasoning in the brain; they happen in real time and are more fundamental. The other resembles the brain's reasoning process, with the large model responsible for logical judgment and reasoning.
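One way to picture the two cooperating models is a fast reflex path backed by a slower reasoning path. The sketch below is hypothetical: the interview does not say how the two models are combined, so the rule that a reflex takes priority is an assumption, as are the thresholds, field names, and the keyword lookup standing in for a large model.

```python
from typing import Optional


def reflex_model(stimulus: dict) -> Optional[str]:
    """Fast path: instinctive reaction with no reasoning (assumed thresholds)."""
    if stimulus.get("sound_level_db", 0.0) > 85.0:
        return "startled"
    if stimulus.get("face_distance_m", 10.0) < 0.3:
        return "wary"
    return None  # nothing urgent, defer to the reasoning path


def reasoning_model(utterance: str) -> str:
    """Slow path: keyword lookup standing in for a large model's judgment."""
    text = utterance.lower()
    if "passed" in text or "won" in text:
        return "happy"
    if "lost" in text or "failed" in text:
        return "sad"
    return "neutral"


def select_expression(stimulus: dict, utterance: str) -> str:
    """Assumed arbitration: a reflex, when triggered, overrides the reasoned choice."""
    reflex = reflex_model(stimulus)
    if reflex is not None:
        return reflex
    return reasoning_model(utterance)


quiet = {"sound_level_db": 40.0, "face_distance_m": 1.0}
loud = {"sound_level_db": 95.0, "face_distance_m": 1.0}
calm_result = select_expression(quiet, "I passed my exam")
loud_result = select_expression(loud, "I passed my exam")
```

In a real system the two paths would likely run concurrently at different rates; the sequential call here only shows the priority between them.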

Finally, many people are curious about why we chose the gray face. On the one hand, it avoids the uncanny valley effect and reduces discomfort. On the other hand, we think robots shouldn't look exactly like humans, because that could raise ethical problems in the future. We hope to find a universal face that is independent of gender and race and can be accepted around the world. The current gray face is one step in that exploration.

36Kr: In your opinion, what are the core advantages of the future winners in this field? How do you judge whether an AI project is genuine?

Guo Hao: First, when judging hardware-related products, you need to look at whether the underlying technology is distinctive and hard to copy. Expression-generation technology, for example, combines mechanical structures, software algorithms, and generative expression driving. It rests on many core techniques (so-called know-how) accumulated over time, which are difficult to crack or copy in the short term. That is the value of the project.

Currently, many software-oriented products want to be associated with AI. You can judge them on the following aspects:

First, the algorithm and model capabilities. The key lies in whether it has the ability to optimize algorithms independently, rather than just using public APIs or adjusting prompts to optimize the product. The difference between the two is huge. Algorithm optimization ability is the foundation for forming core competitiveness.

Second, the data accumulation ability. In AI-related vertical fields, data is crucial. A project that cannot generate new data and relies only on public databases or web crawlers usually lacks barriers and is not essentially different from past practices.

Third, the product implementation and evolution ability. This includes the application scenarios of software products, the real-time feedback, and the evolution speed, which are all key factors to consider.

Cao Rongyun: First, from the perspective of underlying technology and product modality, the outputs of large models, of some software-based AI products, and of AI hardware products are relatively limited in modality. We want to make the output modalities more diverse and interesting. This is similar to autonomous driving. Take our Anni as an example. The input is the audio-visual information in front of the robot, and the output is expressed directly in the robot's expressions, actions, and so on. In the same way, a car's input is information about the road environment and its output is steering, throttle, and brake control. In both cases the output ends up as control variables of the hardware itself. For interactive products, the core advantage is achieving a truly vivid interactive experience through the hardware platform and the underlying algorithms, rather than just putting a large model in a shell.

Second, domestic startups in embodied intelligence have advantages. Take interactive humanoid robots as an example. The Ameca robot from the British company Engineered Arts is still an excellent interactive humanoid robot and is our benchmark. However, when it comes to building robots and integrated hardware-software products in China, I think the breadth, depth, and speed of the domestic supply chain, together with the talent advantage, make it very likely that truly amazing products will emerge, and that development will be faster.

36Kr: Last year, some people in the industry said that humanoid robots would not be commercialized for at least 10 years. What do you think of the situation this year?

Guo Hao: First, my view on humanoid robots is the same as on companion robots. The core problem for humanoid robots is to answer "why must they be human-shaped?", and it's difficult to judge when the inflection point will come. In industrial scenarios, for example, most needs can be met by lower-cost robotic arms, with no need for humanoid robots. Over the past 20 years, the replacement of human labor by machines has not relied on the humanoid form. From this perspective, two conditions must be met for humanoid robots to reach the inflection point:

First, find suitable application scenarios, which are still being explored.

Second, continuously reduce costs. The price is inversely related to the penetration rate. The lower the price, the higher the penetration rate. For example, the price of Unitree's R1 has dropped to 39,900 yuan, with a decline comparable to Moore's Law in the chip field. When the cost of humanoid robots is close to or lower than hiring employees and they are easier to manage, they may see an explosion in application.

The key to cost reduction lies in mass production and a mature supply chain. The larger the production volume, the lower the procurement cost, which in turn enables larger-scale production and forms a positive cycle. Already there are scattered scenarios that support basic demand. For example, Unitree sells well in education and research by supplying schools and research laboratories, meeting their needs for debugging and teaching. Its performance is outstanding, and its costs have fallen accordingly. Production volume is an important factor in industrialization. Currently, many non-standard parts are made by 3D printing, which has long lead times, high costs, and demanding installation requirements, as well as small problems such as snaps not fitting properly. After industrialization, parts can be mass-produced, and the industry has a lot of room for cost reduction. As long as production volume increases, costs can fall significantly.

In addition, from the perspective of application