
Deploying robots requires a kindergarten where they are allowed to make "mistakes"

36Kr Industry Innovation, 2026-05-26 18:29
How can the "kindergarten" jointly built by Sutton and Tashan Technology get robots to learn autonomously and continuously?

Richard Sutton, the founder of Reinforcement Learning, received the 2024 Turing Award together with his mentor Andrew Barto.

The award is not overdue. Over the past thirty years, Sutton's theory has underpinned the evolution of systems such as AlphaGo and ChatGPT. Yet only now, thirty years after its publication, is the Embodied Intelligence industry really coming to understand it:

Agents must learn from mistakes and develop from real experiences.

In 2023, Sutton founded the non-profit research institution Openmind. In April 2025, in a co-authored article titled "Welcome to the Era of Experience", Sutton made the point again:

"New generations of agents must, like humans, have a continuous stream of experience over a long period and develop through real physical feedback."

This time, Sutton is looking beyond theory.

In May of this year, Sutton signed an agreement with Tashan Technology in Canada to advance a project called "Robot Kindergarten" as a long-term collaboration.

A Turing Award winner and a Chinese haptic technology company have reached an agreement and jointly made a prediction for the next decade of Embodied Intelligence: the new path for robot training could lie in real touch and trial and error.

Embodied Intelligence lacks "first-person experience"

Ma Yang, the CEO of Tashan Technology, offers a clear assessment: for robots to do useful work, two problems must be solved. The first is locomotion in the physical world, whether on two legs, four legs, or wheels. Many companies are working on this problem.

The second is the manipulation of objects, e.g., grasping, placing, and turning. Actions must flow smoothly, without the deviations of one action disrupting the next. Together, these two capabilities can cover about 90% to 95% of the tasks that people currently want to delegate to robots.

From the beginning, Tashan Technology wanted to solve the second problem from the perspective of haptics.

When Tashan Technology was founded in 2017, most robot manufacturers were working on mobile platforms and demonstrating abilities such as running, jumping, and rolling. However, over 90% of human physical interactions are actually carried out with the fingers. Fingers are not like legs. They must constantly be in contact with different objects to sense, decide, and adapt. This is a difficult and continuous process.

Solving the "finger problem" is the central variable in Embodied Intelligence and fundamental to getting robots to do real work. Tashan Technology has been on this path for almost ten years.

The mainstream training approach in Embodied Intelligence is end-to-end imitation on static datasets, much like working through practice exercises. Data demonstrated by humans is essentially second-person experience: robots learn from human actions but cannot "touch" for themselves, and therefore cannot understand how the physical world works.

Tashan Technology recognized the problems with this route early on: just as children learn through both imitation and practice, the "enlightenment" training of robots requires not only imitation but also first-person experience of their own.

Training in which the robot senses the effects of an action as it happens and adjusts its behavior based on that feedback is perhaps the closest thing to a "self-training" ability for Embodied Intelligence.

This assessment is in line with Sutton's thoughts.

The concept of the "stream of experience" proposed by Sutton requires that an agent's learning and acting be fully integrated. Every action collects data, and every piece of feedback is a training signal. The real environment that provides first-person experience is therefore the key to implementing this concept.
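To make the idea of acting and learning in the same loop concrete, here is a minimal sketch in Python. It is not Sutton's or Tashan Technology's system: it swaps in a simple epsilon-greedy bandit learner, and the task (picking one of five grip force levels, of which only level 2 holds the object), the function name, and all parameters are assumptions made up for illustration.

```python
import random


def run_experience_stream(steps=2000, seed=0):
    """Toy 'stream of experience': act, observe feedback, update, repeat.

    Hypothetical task: choose a grip force level 0..4. Only level 2 holds
    the object (reward 1); every other level fails (reward 0).
    """
    rng = random.Random(seed)
    n_actions = 5
    good_action = 2              # assumed for illustration
    values = [0.0] * n_actions   # running estimate of each action's reward
    step_size = 0.1
    epsilon = 0.1                # fraction of exploratory actions

    for _ in range(steps):
        # Acting: mostly exploit the current estimate, sometimes explore.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # Feedback from the (simulated) environment.
        reward = 1.0 if action == good_action else 0.0

        # Learning happens immediately, inside the same loop as acting.
        values[action] += step_size * (reward - values[action])

    return values


if __name__ == "__main__":
    print(run_experience_stream())
```

There is no separate dataset or training phase here: each failed attempt and each success updates the estimate on the spot, which is the property the "stream of experience" asks for.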

However, the concept long remained theoretical, because the real physical environment could not provide cost-effective, high-frequency, standardized interaction feedback. The Embodied Intelligence industry has long focused on the problems of the "brain" and the "eyes" and has lacked a channel that can precisely capture physical touch.

Haptics is the most central perception channel in physical interaction. When a robot touches an object, a haptic sensor can capture in real time the three-dimensional force distribution at the contact point, the local deformation of the object, and its tendency to slip. With this information, the robot can quickly adjust force and angle and decide whether to hold on or let go.
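The hold-or-let-go decision described above can be sketched with a textbook Coulomb friction model. This is only an illustration under stated assumptions, not Tashan Technology's algorithm: the `HapticReading` type, the `next_grip_force` function, the friction coefficient, the safety margin, and the force limit are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class HapticReading:
    """Hypothetical three-dimensional force reading at one contact, in newtons."""
    fx: float  # shear force along x
    fy: float  # shear force along y
    fz: float  # normal force pressing into the object


def next_grip_force(reading, friction_coeff=0.5, safety_margin=1.2, max_force=10.0):
    """Return the normal force to command next, or None to let go.

    Coulomb friction: the contact holds while shear <= friction_coeff * normal.
    """
    shear = math.hypot(reading.fx, reading.fy)
    # Normal force needed to resist the measured shear, with some headroom.
    required = safety_margin * shear / friction_coeff
    if required > max_force:
        return None  # holding would need more force than allowed: release
    # In this sketch the grip only tightens when needed and never loosens.
    return max(reading.fz, required)


if __name__ == "__main__":
    print(next_grip_force(HapticReading(fx=0.3, fy=0.4, fz=1.0)))
```

A real controller would also use the deformation and slip signals the sensor provides; the sketch uses only the force vector to keep the decision rule visible.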

With the development of high-precision haptic perception technology, robots no longer lack an "input nervous system". Theorists like Sutton have also begun to focus on this field. In November 2025, Sutton visited China and actively reached out to two Embodied Intelligence companies, including Tashan Technology.

Sutton visits Tashan Technology

Tashan Technology has the most comprehensive technology stack in the field of haptic perception.

The haptic sensor independently developed by Tashan Technology has a force resolution of 0.01 N, comparable to "the force of a hair falling on a finger". Building on years of research and development in AI haptic perception and a full-stack haptic solution, Tashan Technology has solved the globally recognized problem of analyzing multi-dimensional haptic signals simultaneously and has built a complete technology chain of "chip - sensor - algorithm model - application scenario".

While most haptic sensor manufacturers still stop at one-dimensional force measurement or simple capacitance change, Tashan Technology analyzes three-dimensional force, material recognition, proximity perception, and coordinated perception simultaneously.

More importantly, Tashan Technology has brought haptic perception into mass production. Over the last two years, its products have entered the commercial phase and are being supplied in volume to leading manufacturers of dexterous hands. In 2025, Tashan Technology captured more than 80% of the market for haptic sensors in humanoid robots.

TS-VT Visuohaptic Training System

After visiting Tashan Technology, Sutton moved quickly to advance the cooperation, not only because the two sides agreed on methodology but also because he saw in Tashan Technology a team that has taken haptic perception from the laboratory to industrial deployment.

Thus, thirty years after the publication of the Reinforcement Learning theory, theory and technology in the Embodied Intelligence industry have come together: the academic visionary has found an ally who can put the theory into practice, and Tashan Technology has gained the theoretical foundation it lacked for using haptics to accelerate robot training.

Robot Kindergarten: "Enlightenment" in the real environment

The specific goal of the cooperation between the two sides is the "Robot Kindergarten".

At Tashan Technology, Sutton observed Chinese students in robotics courses and was struck by how open the environment for Embodied Intelligence is in China. Humans and robots can interact more naturally with each other, and from this the idea of the Robot Kindergarten was born.

The Robot Kindergarten is a training system for developing robots' continuous learning ability, combining haptics and multimodal experience. It integrates the real physical environment, the simulation environment, multiple robots, haptic and multimodal perception devices, task courses, data collection, and evaluation mechanisms, so that robots can gather learnable experience through repeated touching, trying, failing, and correcting.

Why "Kindergarten"? Ma Yang says that today's Embodied Intelligence is like a child between 0 and 3 years old. We watch videos of robots doing various things and find them impressive. In fact, however, the success rate is not high, and the robot does not even know whether it succeeded. "It just does something, and people applaud."

It is difficult for robots to learn what they have done right from correct human demonstrations, because the concept of "right" is vague and covers a wide range. Only mistakes have boundaries. Only enough failed attempts can show a robot where the boundaries of a task lie and how it should adjust its next action.

"The sense of safety in Embodied Intelligence is not defined by a line but learned through objective interaction."

Ma Yang is convinced that, just as humans acquire their safety instinct not by reading manuals alone but through repeated touching, falling, and adjusting, robots can only understand what is unsafe through enough real failed attempts. If robots can set their own safe action boundaries, they can protect themselves and also ensure the safety of others.

After Sutton's visit to Tashan Technology, the cooperation advanced quickly, and the contract was signed on May 11, 2026.

At the contract signing, Sutton talked about the significance of the cooperation: "Back when we were students, people were already proposing to build a robot like a baby that interacts with the world and grows through experience. That idea was almost impossible to implement at the time. Now we have enough computing power and enough experience with robots, but I think the missing key component was a clear recognition of the value of this goal. It takes not only money but, above all, time and patience."

Sutton said he was surprised, during his visit to Tashan Technology, to find that this Chinese company understood this. The cooperation is planned to run for five years and aims to find the best learning methodology for Embodied Intelligence.

Venue of the contract signing

In the next step, the "Robot Kindergarten" will build a real environment and train robots in it. Although the initial training phase will use homogeneous robots, Ma Yang is convinced that heterogeneous robots will not pose major learning obstacles in later phases. If an agent understands the underlying logic of a task, differences in robot form will not hinder learning or the transfer of experience.

By comparison, confronting real-world environmental variables matters more right now.

Ma Yang has said openly that hardware in the Embodied Intelligence industry is already at a level of 60 points, but that the industry lacks reasoning ability and continuous learning ability. Without these two abilities, there can be no better generalization or deduction, and the entire industry will be drawn into parameter optimization without finding broader application areas.

Early learning must therefore involve constant interaction with the real environment. The training environment must not deliberately exclude the variables and adverse factors of real scenes; otherwise, the robot's experience is capped at a low ceiling and it can hardly make progress.

The cooperation between Tashan Technology and Sutton also aims to find a new way. "There are no high-tech solutions for this topic, only the choice of the right methodology."

Prerequisite for commercialization: "Learning while working"

The methodology must ultimately be tested in application scenarios. Ma Yang has also made a realistic assessment of commercial deployment: over the next three to five years, Embodied Intelligence is unlikely to be used in scenarios that demand complex logic and fast response times.

It is better suited to replacing a certain type of work: work that people do not like to do and where the error rate must not be too high.

This type of work has three characteristics: the tasks are repetitive but not entirely confined to a fixed assembly line; the success rate must be very high, because a failure may interrupt the work and require substantial human intervention; and the time pressure on individual tasks is relatively low, with no need for reactions within seconds.

Ma Yang has given some examples. One is the service sector, e.g., dishwashing staff in restaurants in North America. Their task is to rinse the dishes and load them into the dishwasher. The action is simple but boring and tiring. Currently, hundreds of thousands of people work in this position in the United States. If robots can achieve a high enough success rate at this task, they can create enormous commercial value. At the same time, there is no high...