
From the 996 work schedule to hiking in the mountains and fields, a former senior executive at XPeng has built himself an AI “outdoor buddy”.

晓曦 2025-07-11 10:33
A fully functional prototype will be launched by the end of this year.

On a weekend morning in early summer in Silicon Valley, sunlight filters through the pine needles and scatters golden flecks across the mountain trail. The kids, finally free to play to their hearts' content, rush ahead in a flurry. Before anyone can call them back, they have already skipped around the corner, leaving their parents, loaded down with camping bags, struggling to catch up.

With great difficulty, the parents drag the equipment to the picnic area. Before they can even catch their breath on the picnic mat, they hear splashing. They turn to see that the kids have kicked off their canvas shoes and stepped barefoot into the stream, stirring the water vigorously with branches and waving now and then for their parents to join them.

In North America, outdoor activities have long been a standard part of family weekends. According to reports from the Outdoor Foundation and others, 175.8 million people in the United States took part in outdoor leisure activities in 2023, or 57.3% of the population aged 6 and over. Of these, 61.44 million were hikers. The camping market is also booming: the US market was worth $20.38 billion in 2024 and is expected to reach $71.88 billion by 2035, a compound annual growth rate (CAGR) of 12.14%.
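The growth rate quoted above can be checked against the two market-size figures. The following is a minimal sketch, assuming the standard compound-growth formula and 11 compounding years between 2024 and 2035; the function name `cagr` is illustrative and does not come from the cited reports.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.12 means 12%)."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    # end = start * (1 + r) ** years, solved for r
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
rate = cagr(20.38, 71.88, 2035 - 2024)
print(f"Implied CAGR: {rate:.2%}")
```

Under these assumptions the implied rate rounds to the 12.14% stated in the article, so the three quoted numbers are consistent with each other.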

But the parents in the middle of it all may often wonder: is this really about enjoying nature and relaxing, or is it an extreme challenge of "taking care of the kids + watching over the equipment + ensuring safety" all at once? Under this heavy workload, they experience both pain and joy.

Wang Tao, an entrepreneur born in the 1980s, is one of these outdoor enthusiasts.

In 2009, Wang Tao went to Stanford University for further studies under Professor Andrew Ng. Later, together with his advisor and classmates, he founded the artificial intelligence company Drive.ai. After the company was acquired by Apple, he joined XPeng in 2019 as head of its North American visual perception team.

A seasoned outdoor enthusiast who has lived in the United States for many years, Wang Tao has been through this scene on countless weekends: more than ten kilograms of equipment on his shoulders, and excited kids tugging at his hand as they try to run ahead.

The scene set him thinking. Although the outdoor boom of recent years has produced peripheral equipment such as positioning bracelets and smart cookers, most of these products merely stack up functions and have never solved the core pain point of family outings: parents have to carry the heavy load while keeping an eye on the kids, and cannot fully enjoy the outdoors themselves. The latest survey by Statista shows that more than 78% of parents reduce the frequency of outdoor activities because of the heavy equipment and the pressure of taking care of the kids.

"What if there were a robot that combined practical functions with interactive companionship to solve these problems?" The thought lingered in Wang Tao's mind for a long time. It was not until 2024 that he sensed the window of opportunity to turn the idea into reality had opened.

The iteration of consumer-grade robots follows a clear evolution logic: each round of form change is jointly affected by market demand trends, technological maturity, and cost reduction, ultimately giving rise to product forms suitable for new scenarios.

In the past year, the outdoor consumer field has witnessed explosive growth, and intelligent outdoor equipment represented by garden robots has become the focus of the market. User needs have also changed synchronously, from the basic satisfaction of a single function to the diversified pursuit of "both companionship attributes and practical value".

Wang Tao founded Shentingji, precisely targeting the outdoor companionship track. Its first product enters the market in the form of a robotic dog. On the one hand, through technological innovations such as the intelligent following system and the load-bearing module, it effectively solves outdoor pain points such as equipment handling and safety supervision; on the other hand, with the emotional interaction design, the product goes beyond the traditional tool attribute and truly becomes the user's "outdoor life partner".

It can be said that on the basis of meeting the rigid needs of North American users for practical functions, it attempts to further respond to the deep-seated expectation of the harmonious coexistence of technology and nature.

Shentingji has found a new blue ocean in a crowded market, and the foundation of its product innovation lies in accurately capturing user pain points and meeting real needs. For Chinese hardware enterprises, this kind of in-depth exploration and differentiated expression based on scenario pain points is precisely the key to establishing a unique voice in global competition today.

Wang Tao, CEO of Shentingji Intelligent Technology

The following is a dialogue between 36Kr and Wang Tao, CEO of Shentingji Intelligent Technology, with the content edited:

The Changing User Needs Behind the Development of Consumer Robots

36Kr: Consumer-grade robots are an important application market in the current wave of embodied intelligence. Into which key stages can their development be divided?

Wang Tao: The consumer-grade robot industry has gone through three waves, and current products are in the transition from the exploration period to the growth period. Each market boom has been directly tied to technological maturity and falling costs. In essence, it is a process in which technological breakthroughs open up new scenarios to explore.

The first wave was the rise of sweeping robots, driven by functional automation and made possible mainly by the maturity of automation technology and SLAM (Simultaneous Localization and Mapping). Lightweight SLAM algorithms run reliably on low-compute edge chips, giving robots the ability to navigate autonomously in the home. At the same time, falling costs for hardware such as batteries, sensors, and motors pushed product prices down into the mass-consumption range, laying the market foundation for consumer-grade robots.

The second wave was the garden robots that have become popular over the past two years. These robots sit between indoor and outdoor scenarios. They build on the logic of sweeping robots and graft on some autonomous driving capabilities, such as GPS RTK high-precision positioning and visual obstacle avoidance, to handle tasks such as mowing and snow removal in the semi-structured garden scenario. Acceptance is relatively high in the European and North American markets. Products at this stage mark the expansion of consumer-grade robots from indoor to outdoor scenarios and show clear signs of technological integration.

The third wave was the emergence of educational and emotional companion robots, catalyzed by breakthroughs in AI, especially the improved dialogue ability of GPT-like models. These products use physical hardware as a carrier and build conversational AI into robots or plush toys. Simply put, they are traditional IP toys with large language models embedded in them. They lack spatial understanding and the ability to act physically, their mobility and execution functions are weak, and it is hard for them to create a compelling, sticky usage scenario. Since many players have already entered the market at this stage, homogeneous competition is likely to follow.

Whenever technology develops to a certain mature node, new products emerge. With the progress of spatial situation awareness and multimodal scenario understanding technology today, we believe that the next opportunity for consumer robots has arrived.

36Kr: Have these technological trends also changed the market and users?

Wang Tao: At the market level, after several rounds of market cultivation and the amplifying effect of social media, users have gradually accepted the trend of robots entering the home. In users' eyes, robots are no longer unattainable futuristic technology but an accessible part of future life.

At the same time, users' needs are changing. They are no longer satisfied with functionality alone; they also expect robots to be eye-catching, interactive, and companionable. Technology is driving the continuous evolution of products, and users' needs are evolving along with it.

Therefore, we firmly believe that when the next generation of consumer robots can achieve a qualitative leap in user experience, functional depth, and emotional connection, they are expected to become an indispensable part of people's lives, just like the iPhone back then.

36Kr: When did you have the idea of starting a business?

Wang Tao: I had the idea a long time ago. In 2009, I entered Stanford University for postgraduate studies and then continued my doctoral studies under the guidance of Professor Andrew Ng. In the blink of an eye, I have lived and worked in San Francisco, the United States, for 16 years.

People often joke that the Bay Area is a high-tech countryside. The smartest people in the world gather here, but there are very few affordable ways to relax and have fun. The most common weekend activities for engineers and programmers are hiking, picking cherries, and looking at houses, which we call the "three vulgarities in the Bay Area".

I am one of those hiking enthusiasts. On weekends, I take my kids to the beach or the nearby mountain parks to hike, relax, and get close to nature. After enough of these walks, you find that although the scenery is beautiful, it can get really boring. At those moments, I would often think how valuable it would be to have an autonomous companion robot along with us: one that could carry a load of 10-15 kilograms, help with the bags and the drinking water, play games with the kids at the campsite, and record the most natural and wonderful family moments. Ideally it would look cool and a bit cute, with a contrast between its appearance and its behavior, so that it turns heads on the trail. This is not a fantasy divorced from real life but a real need of people in the Bay Area, especially those with kids who have worked in the technology industry for many years.

Later, I joined XPeng and took part in the R&D work of the AI robot team. During this period, I saw the possibility of turning the idea into reality. So I left my job to start a business in 2024, hoping to really make this happen.

36Kr: What key nodes made you think that the time for starting a business had come?

Wang Tao: Over the past decade, I have been deeply involved in autonomous driving and witnessed earth-shattering changes in the field.

From a technical perspective, the evolution of autonomous driving technology has provided key support for the development of robots. When I was at Drive.ai, the mainstream solutions on the market were mostly built on high-precision maps, lidar, and fixed routes. The industry has since changed, and map-less driving has become a significant trend. Automakers represented by Tesla and XPeng have removed high-precision maps and lidar from their urban advanced driver assistance systems. This breakthrough can be carried over to robots, greatly improving their ability to understand scenes and move autonomously.

The robot hardware ecosystem has also changed for the better. A large number of companies manufacturing robot body hardware have emerged in China. Fierce competition has steadily driven hardware costs down and performance up, and hardware has become far more readily available.

Taken together, robot hardware is gradually becoming commoditized and is no longer the main constraint on development. At the same time, software and AI have become more important: they are now what distinguishes one robot product from another and what defines a robot's core value.

Positioned as an "Outdoor Sidekick Robot"

36Kr: Why did you choose the outdoor scenario as the first choice?

Wang Tao: From a technical perspective, choosing the outdoor scenario was a natural choice for us.

We found that the scenarios with real implementation potential are semi-structured scenarios. Purely structured scenarios are more suitable for traditional SLAM solutions, such as the SLAM technology used by sweeping robots. These scenarios have relatively low requirements for computing power and mainly focus on achieving purely functional goals.

The outdoor scenario is an excellent entry point. It not only has social attributes but also can meet the needs of emotional companionship. However, there are many challenges at the technical level.

36Kr: What opportunities did you see in the North American market, and what are the characteristics of this scenario?

Wang Tao: From the very beginning, we were determined to take our products overseas, and we chose the United States as the first launch country. The United States is the most mature market in the world for family leisure and outdoor sports. According to statistics, more than 100 million people there regularly take part in activities such as camping, hiking, and picnics, and the average family spends thousands of dollars a year on outdoor recreation. Users' willingness to pay is extremely strong.

If a product can build its brand and earn a good reputation in the North American market first, its path to global expansion will be smoother. The plan is to shape user perception in North America first and then gradually extend that influence to European, Japanese, South Korean, and high-end Chinese user groups.

In our research, we found that although outdoor scenarios in the United States are somewhat complex, they are also fairly uniform. For example, the parks people often visit generally have lawns and trees, some have bodies of water, and many include small sports fields and children's play areas. There are also natural landscapes, such as mountains and small state parks, which usually consist of wild grassland, hills, and dirt trails.

I think these belong to semi-open scenarios. Although their specific layouts and details differ, the overall combination of core elements is relatively similar. Unlike autonomous driving on public roads, which has to deal with all kinds of traffic conditions, semi-open scenarios place relatively controllable demands on robots, allowing them to adapt better to the natural environment.

36Kr: What work did you do in the early stage?

Wang Tao: The team carried out a lot of work during this period. In the cold-start stage, we needed to collect a large amount of outdoor data by ourselves. We used open-source large models for semi-automatic annotation, which significantly reduced the cost. At the same time, we used public data sets for domain transfer learning to give the model stronger generalization ability.

There are many problems in the outdoor environment that are rarely seen in indoor scenarios. For example, the camera content may be blurred during movement, and changes in lighting such as dappled tree shadows or backlighting can interfere with the perception system.

To solve these problems, we introduced multi-frame visual fusion, balanced high dynamic range against short exposure time, and carried out customized development starting from the sensor end. In terms of deployment strategy, we adopted a self-degradation mechanism. When the perception system detects low confidence, the robot automatically degrades: it slows down, gives up tracking or interacting with complex objects, and switches to a more basic operation mode. In this way, the product design itself ensures that the user experience does not break down.
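The self-degradation idea described above can be illustrated with a small sketch. This is not Shentingji's implementation: the three mode names, the confidence thresholds, and the speed limits below are all hypothetical values chosen only to show how low perception confidence could map to progressively more conservative behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FULL = "full"        # normal following, tracking, and interaction
    REDUCED = "reduced"  # slower, no tracking of complex objects
    BASIC = "basic"      # most conservative operation mode


@dataclass(frozen=True)
class Behavior:
    max_speed_mps: float
    track_complex_objects: bool
    allow_interaction: bool


# Hypothetical behavior limits for each mode (assumed values).
BEHAVIORS = {
    Mode.FULL: Behavior(1.5, True, True),
    Mode.REDUCED: Behavior(0.6, False, True),
    Mode.BASIC: Behavior(0.2, False, False),
}


def select_mode(confidence: float,
                reduced_below: float = 0.6,
                basic_below: float = 0.3) -> Mode:
    """Map a perception confidence in [0, 1] to an operating mode."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < basic_below:
        return Mode.BASIC
    if confidence < reduced_below:
        return Mode.REDUCED
    return Mode.FULL


for confidence in (0.9, 0.45, 0.1):
    mode = select_mode(confidence)
    print(confidence, mode.value, BEHAVIORS[mode])
```

A real system would probably add hysteresis or a time filter so the robot does not flip between modes on a single noisy frame; that is left out here to keep the sketch short.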

36Kr: Previously, the "brain + cerebellum" architecture was the mainstream in the market. What are the differences between the edge-side "slow brain + fast brain" AI architecture you proposed and the former?

Wang Tao: In the "brain + cerebellum" architecture, the brain is responsible for thinking, and the cerebellum is responsible for motion control. However, I think this division is not very comprehensive. In reality, many thinking processes require extremely fast processing speed. The running speed of the brain is relatively slow, and it is impossible to complete certain tasks only relying on the brain. Similarly, some motion control operations cannot be simply achieved by the cerebellum.

Take playing football as an example. It requires the coordination of the eyes and the body, with very fast visual perception ability to instantly perceive the surrounding situation, such as the position of people, the position of the ball, the passable area, and obstacles, and quickly make judgments. This part of the work is often done in the mammalian brain.

In contrast, in the edge-side "slow brain + fast brain" architecture we proposed, the "slow brain" is responsible for thinking things through, while the "fast brain" is responsible for rapid response, which is similar to the cooperation between thinking and instinct when people are doing outdoor activities.
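One common way to realize this kind of split in software is a two-rate loop: a fast routine runs on every control tick, while a slow routine runs only occasionally and updates shared state that the fast routine reads. The sketch below illustrates that general pattern under stated assumptions. The names `WorldState`, `slow_brain`, `fast_brain`, and `run`, the 0.5 m stop distance, and the tick ratio are all hypothetical and are not taken from the interview.

```python
from dataclasses import dataclass


@dataclass
class WorldState:
    scene: str = "unknown"  # slowly changing understanding of the scene
    slow_updates: int = 0


def slow_brain(state: WorldState, observation: dict) -> None:
    """Low-frequency step: refresh the scene understanding."""
    state.scene = observation.get("scene_hint", state.scene)
    state.slow_updates += 1


def fast_brain(state: WorldState, observation: dict) -> str:
    """High-frequency step: react immediately using the latest state."""
    if observation.get("obstacle_m", float("inf")) < 0.5:
        return "stop"
    return "follow_slowly" if state.scene == "public_road" else "follow"


def run(observations: list, slow_every: int = 10):
    """Run the fast step every tick and the slow step every `slow_every` ticks."""
    state = WorldState()
    actions = []
    for tick, observation in enumerate(observations):
        if tick % slow_every == 0:
            slow_brain(state, observation)
        actions.append(fast_brain(state, observation))
    return actions, state


observations = ([{"scene_hint": "park"}] + [{}] * 9
                + [{"scene_hint": "public_road", "obstacle_m": 0.3}, {}])
actions, final_state = run(observations)
print(actions[0], actions[10], actions[11], final_state.slow_updates)
```

In this toy run the fast step reacts to the nearby obstacle on the same tick it appears, while the scene label changes only when the slow step next runs. A production system would more likely run the two parts in separate threads or processes than in a single loop.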

36Kr: How do they work together in actual tasks?

Wang Tao: Specifically, the "slow brain" mainly processes things that do not require millisecond- or sub-second-level responses but need in-depth understanding. It updates the internal state continuously through low-frequency operation. For example, environmental modeling, automatically identifying whether the current environment is a park, a family garden, indoor, or a public road; identifying users and their preferences, such as remembering the faces of kids and parents, and knowing who likes playing football and who likes taking pictures; understanding and decomposing tasks, converting instructions such as "get me some water" or "take care of the kids" into specific executable task chains. At the same time, the "slow brain" is also responsible for memory management, including remembering the location of the