A former XPeng executive started a business and created an outdoor companion robot for American families.

Qiu Xiaofen, 2026-01-14 11:05
Companion robots should not be GPT-style chat tools but should provide real connections in the physical world.

Text | Qiu Xiaofen

Editor | Su Jianxun

At CES 2026, the booth of a Chinese robotics company was crowded with people. What they were gathered around was not the common humanoid robots or quadrupedal dogs seen at CES, but a robot with two wheels, a square body, and a display screen. It weighs only 15 kilograms and is 40 centimeters tall.

It occasionally carried a ball across the turf, steadily traversing different types of ground. When it noticed an audience member taking pictures of it, it would stop to interact, gently swaying from side to side and switching between expressions on the screen. On closer inspection, it also had a panoramic camera on top, recording the surrounding crowd in real time.

This robot, Rovar X3, is from "Shentingji", and it is mainly designed for outdoor companionship.

This robot has not yet been officially launched. However, Wang Tao, the founder of "Shentingji", told "Intelligent Emergence" that it will be priced below $5,000. "It may break the price expectations of many Americans for the robot category."

△ Rovar X3 can play football and serve as a photography stand

The designer behind this new robot form is equally bold: Wang Tao, the founder of "Shentingji", enjoys uncertainty of all kinds.

This preference is reflected in every key career decision he has made. In 2015, while pursuing a doctorate in deep learning and visual perception at Stanford, he co-founded the self-driving company Drive.ai, which specialized in L4 autonomous driving, with his supervisor, Andrew Ng.

After the ups and downs of the self-driving industry, the company finally found a good home. In 2019, this 200-person Silicon Valley star company was acquired by Apple in an acqui-hire and folded into Apple's car-building effort, "Project Titan".

However, Wang Tao did not join Apple like others. "It's not that Apple is bad. I could see where I would be in ten years, and I resisted such a future," Wang Tao said straightforwardly.

In the end, he chose to join XPeng. Under Wu Xinzhou, then XPeng's vice president of autonomous driving, Wang Tao spent three years building XPeng's visual perception team from scratch.

Whether it was entering the self-driving field in 2015 or leaving XPeng to start a robotics business in 2024, Wang Tao likes to blaze a trail before a sector has converged. Behind the uncertainty lies the dividend of "ten-fold growth".

△ Wang Tao, Photo source: Provided by the interviewee

Behind Wang Tao's enjoyment of uncertainty is another side of his character: being practical and focusing on implementation.

While the industry is talking about the stories of humanoid robots doing work, Wang Tao chose to start from the outdoor scene and gradually move towards the home scenario. He calls their outdoor companion robot, Rovar X3, the most promising physical AI MVP (Minimum Viable Product) to enter the home.

Wang Tao, who has lived in Silicon Valley for 16 years, told us that this product idea stems from his observation of the people around him.

In Silicon Valley, when the smartest people in the world gather together, the biggest problem is boredom. There is a saying of the "Three Common Pastimes in the Bay Area". These people basically do only three things on weekends: take their children on outdoor hikes, pick cherries in the wild, and look at houses.

Seeking outdoor companions is a real need for these Silicon Valley people. Previously, they only had two options: pet dogs or the Boston Dynamics quadrupedal dog, which costs up to $75,000.

Therefore, Wang Tao designed three main uses for Rovar X3:

First, it can act as an "outdoor companion". Its vision system can follow the owner closely by recognizing the owner's biometric features (face, gait, body shape). On relatively complex outdoor terrain, it can avoid obstacles nimbly and carry heavy loads, all without remote control.

Second, it can help take care of children, play hide-and-seek with them, pick up balls, and play football with humans.

Third, you can mount your phone or a panoramic camera on it, turning Rovar X3 into a shooting stand.

△ At CES 2026, Rovar X3 greeted the audience

Wang Tao admitted that starting a business in robotics brings an extra thrill of "creating things" compared with his earlier work in self-driving. Robots are no longer cold automated devices but "physical AI" with autonomy and a sense of life.

For now, however, the directions of "creating things" are diverse. The product form of companion robots has clearly not converged, and many of Wang Tao's judgments about this category differ from those of most people in the market.

In the design logic of Rovar X3, he places more emphasis on "whether it has done things together" than on "whether the robot can chat". Only by letting the robot take part in real-world actions and cooperate with humans to complete tasks can long-term companionship occur.

This inspiration came from the ninth employee of "Shentingji": a little dog. Wang Tao observed that during lunch breaks, employees would spontaneously take the dog to the lawn, throw the ball, and run. These real-world interactions built a deep emotional connection between the humans and the dog.

△ Rovar X3 can pick up balls

After establishing the connection, a further question is: How can the robot's intelligence be continuously enhanced?

Most people choose to send the robot to a factory or lock it in a specific niche scenario to let the model evolve.

Wang Tao believes that although this approach can lead to delivery, it is not conducive to the improvement of the robot's intelligence. Taking the factory scenario as an example, it is highly standardized with a low tolerance for errors.

This means that the uncertain factors and information that can help the model evolve also disappear. In contrast, GPT emerged with the support of complex and diverse data.

Tesla is a classic example. Elon Musk didn't start by implementing the project in mines or closed parks. Tesla's MVP was to run L2 assisted navigation directly on the highway. "It didn't seem very fancy at that time, but its earliest usage scenario was on public roads, which is quite similar to the final scenario of FSD. The DOMAIN (scope) of the data is relatively close."

In Wang Tao's view, the outdoor companion scenario is a natural option closer to the home, and the data requirements of the two are basically the same. Therefore, after initially gaining the trust of users, Wang Tao hopes that the robot can gradually enter the yard and then the home to collect more diverse user data.

These data will also become the proprietary data for training the operation model, deepening the robot's understanding of the world and gradually forming a "data flywheel".

Recently, "Shentingji" completed a 100-million-yuan angel-round financing, led by BlueRun Ventures and followed by Particle Future Fund. "Intelligent Emergence" had a conversation with Wang Tao about his rarely disclosed past and his understanding of building an embodied intelligence MVP.

The following is the transcript of the conversation (slightly edited)

Seeking "Ten-Fold Growth" in Career Choices

"Intelligent Emergence": Previously, you founded Drive.ai with Andrew Ng. After this project was acquired by Apple, you didn't join the Titan project but instead joined XPeng. Were there many people who made such a choice at that time?

Wang Tao: I started my master's degree at Stanford University in 2009, majoring in deep learning and visual perception, and later pursued a doctorate under Andrew Ng. At that time, autonomous driving suddenly became popular. Andrew Ng, his wife, and several of my fellow students co-founded Drive.ai to work on L4 autonomous driving.

I was a co-founder and also the director of engineering and R&D, responsible for the PNC module, which is the planning and control module, and for system integration. Drive.ai was acquired by Apple in 2019. To be honest, few people declined to join Apple after the acquisition. My choice at that time did seem a bit non-mainstream.

"Intelligent Emergence": How did you make the decision not to join Apple?

Wang Tao: First of all, for me, what's really attractive is not joining a proven company but doing something that doesn't have a standard answer yet. Apple represents completion, while what I cared more about at that time was the process of creation.

Secondly, I thought that L4 autonomous driving was a long-term field where it might take many years to see a product. For a startup, relying on a "deep-pocketed sponsor" was the wiser choice.

In terms of technology, L4 requires global optimization. Perception, prediction, planning, the back-end data closed loop, and the computing platform are all strongly coupled.

However, Apple, as a very typical and successful company, has a highly modular organizational structure. I noticed that each team member at Apple clearly defined their input and output interfaces and tried to minimize coupling, which is obviously the approach of a hardware company.

So, I thought that doing L4 at Apple might turn into a research project that was difficult to implement. But my style is to make things happen and put them into practice.

"Intelligent Emergence": So, implementation is very important to you.

Wang Tao: Yes.

△ Rovar X3 has different expressions

"Intelligent Emergence": You said you didn't want to join a company that had been proven by the market. Why did you choose to join XPeng later?

Wang Tao: I was recruited by Wu Xinzhou. In the first three years at XPeng, I built the visual perception team from scratch, responsible for model training, data collection and annotation, model deployment, and engineering. In fact, it was like an entrepreneurial experience at XPeng.

At the beginning of 2023, I thought that the perception and AI in autonomous driving had converged, and the pattern had been set. My career planning has always been about seeking ten-fold growth. I judged that autonomous driving was no longer the next ten-fold growth opportunity.

"Intelligent Emergence": What made you think that the time for entrepreneurship was really right?

Wang Tao: In 2023, I led the AI team at XPeng Robotics. Through on-the-ground practice, we noticed many long-standing but underestimated problems in this field.

For example, the cost of collecting and annotating real-world data is very high, and this problem still exists today. There is a huge gap between simulation and reality. The hardware platform is over-designed in pursuit of extreme motion performance. We also had frequent internal discussions about whether to use a wheeled chassis or bipedal locomotion for walking.

These problems would lead to an exponential increase in system complexity and slow down the iteration speed of intelligence. I gradually began to have an idea: If I had sufficient resources to solve these problems, how would I do it?

This kind of thinking gradually led to a judgment: to truly move embodied intelligence forward, some new product paths are needed.

Entrepreneurship is a continuous process of thinking, not a momentary inspiration. I spent some time thinking it through. I left XPeng Robotics in January 2024 and founded the new company in April. I believe that embodied intelligence is definitely the next ten-year, ten-fold growth opportunity. But if we want to redefine a new generation of robots, robots should be autonomous, have a sense of life, and be AI-first, rather than cold automated devices. I had an impulse and excitement for creation at that time.

"Intelligent Emergence": What did you do during the three months of thinking? How did you come up with the outdoor direction?

Wang Tao: I divide robots into four major areas - mobility, manipulation, intelligence, and navigation.

I think a lot of capabilities in mobility, manipulation, and navigation can be reused from autonomous driving. Regarding intelligence, I thought it was in an early stage at that time. I was thinking about how to continuously strengthen the intelligence of the agent.

Ultimately, we hope to learn from the data flywheel methodology of Tesla or XPeng. Past experience shows that the one who can make the data flywheel work will be the winner in the trend of embodied intelligence.

Entering the Factory Is Not Good for Agent Training

"Intelligent Emergence": How did you think about how to strengthen the intelligence of the agent?

Wang Tao: First, we need to find an MVP, a scenario that can truly lead to the endgame. Take Tesla as an example. It didn't start in mines or closed parks. Its MVP was to run L2 assisted navigation on the car.

At that time, this technology wasn't very fancy. It was just adaptive cruise control and lane keeping on the highway. But its actual usage scenario was on public roads, which is quite similar to the final scenario of FSD. The DOMAIN (scope) of the data is relatively close.

I think if robots are to number in the hundreds of millions or even billions, as Musk said, they must enter the home.

Although we don't start directly in the home, we must get as close to the home as possible to gain the trust of home users and collect their data. Second, I believe the usage scenarios of robots must be diverse, not limited to a few vertical fields. We can't collect data in vertical fields.

"Intelligent Emergence": Many embodied intelligence companies are desperately looking for vertical scenarios, such as factories, but your MVP is focused on the home and outdoors. How do you view this difference?

Wang Tao: I think in terms of commercialization, ToB is a good choice. The biggest advantage of ToB is that the demand is clear. As long as the scenario is well defined, robots have a chance to achieve stable execution and delivery.