Xingdong Era releases end-to-end native robot large model ERA-42, the first to drive a five-finger dexterous hand through more than 100 kinds of tasks | Frontline
Author|Huang Nan
Editor|Yuan Silai
Hardcore Keji has learned that Xingdong Era recently released ERA-42, an end-to-end native robot large model. Paired with the company's self-developed five-finger dexterous hand, Xingdong XHAND1, it marks the first time a single embodied large model has driven a five-finger dexterous hand to use a variety of tools across more than 100 complex, delicate manipulation tasks, including picking up screws and driving them with a drill, hammering nails, and righting a cup and pouring water.
Based on ERA-42, Xingdong XHAND1 can complete various new dexterous operation tasks with different tools
In terms of versatility and dexterous manipulation, ERA-42 requires no pre-programmed skills and shows strong generalization and adaptability. With only a small amount of collected data, it can learn a new task in under two hours and keep acquiring further skills quickly.
Based on ERA-42, Xingdong XHAND1 can complete more than 100 kinds of fine, intelligent five-finger dexterous manipulation tasks
Xingdong Era points out that, as the key to unlocking general-purpose embodied agents, an embodied large model needs three elements. First, a single unified model that generalizes across tasks and environments: a unified native model integrates full-modal information such as vision, language, touch, and body posture, and so generalizes to different tasks and environments.
Second, the model is end-to-end: everything from receiving full-modal data to producing final outputs such as decisions and actions runs through one simple neural network pipeline. The process requires no hand-designed features, pre-programming, or intervening processing steps, so the embodied agent can adapt to different tasks and environments in real time, which significantly improves flexibility and development efficiency.
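To make the end-to-end idea concrete, here is a minimal sketch, not ERA-42's actual architecture, which the company has not published in this article. It assumes each modality has already been encoded as a list of numbers, and it uses a tiny two-layer network with random weights purely for illustration. The names `make_policy` and `act` are hypothetical; the only figure taken from the article is the 12 active degrees of freedom of XHAND1.

```python
import math
import random

NUM_JOINTS = 12  # XHAND1's active degrees of freedom, as stated in the article


def make_policy(input_dim, hidden_dim=16, seed=0):
    """Build a toy two-layer network with random weights (illustrative only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
          for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
          for _ in range(NUM_JOINTS)]
    return w1, w2


def act(policy, vision, language, touch, posture):
    """Map all modalities to joint commands in one forward pass.

    There is no hand-written stage between input and output: the modalities
    are concatenated and passed straight through the network.
    """
    w1, w2 = policy
    x = list(vision) + list(language) + list(touch) + list(posture)
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    # tanh keeps each joint command in the normalized range [-1, 1]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w2]


# Hypothetical, already-encoded inputs for a single control step
vision = [0.2, 0.7, 0.1, 0.4]
language = [1.0, 0.0, 0.0]
touch = [0.05] * 5
posture = [0.0] * NUM_JOINTS

policy = make_policy(len(vision) + len(language) + len(touch) + len(posture))
commands = act(policy, vision, language, touch, posture)
```

In a real system the weights would be learned from data rather than drawn at random; the point of the sketch is only that a single function takes every modality in and sends joint commands out.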
Third, the model scales up: it keeps improving as data accumulates, so that as the amount of data grows exponentially, the embodied large model not only performs better but also shows strong adaptability and generalization on unseen tasks. For example, the π0 model released by Physical Intelligence (PI) has all three elements and is a typical end-to-end embodied large model in the true sense.
Building on its end-to-end algorithm, Xingdong Era has adopted a large-scale video data learning strategy that draws on unlabeled video, publicly available data from various types of robots, human activity data, and teleoperation data. Learning action outcomes from these sources effectively reduces the cost of data collection.
Xingdong Era explores a native robot large model that integrates a world model
In addition, the Xingdong Era team has integrated a world model into the native robot large model, so that the model can not only act but also understand the physical world. It can predict future action trajectories, respond quickly to external interference, and keep adaptively optimizing its behavior during execution until the task is complete, which improves the efficiency and accuracy with which the robot performs tasks.
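The predict, compare, and correct loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Xingdong Era's implementation: the "world model" here is a trivial kinematic rule for a point in a plane, and `world_model`, `policy`, `predict_trajectory`, and `run_episode` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class State:
    x: float
    y: float


def distance(a, b):
    return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5


def world_model(state, action):
    """Stand-in for a learned dynamics model: predicts the next state."""
    return State(state.x + action[0], state.y + action[1])


def policy(state, goal, max_step=0.1):
    """Step straight toward the goal, at most max_step per tick."""
    dx, dy = goal.x - state.x, goal.y - state.y
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= max_step:
        return (dx, dy)
    scale = max_step / dist
    return (dx * scale, dy * scale)


def predict_trajectory(state, goal, horizon):
    """Roll the world model forward to predict the future trajectory."""
    trajectory = []
    for _ in range(horizon):
        state = world_model(state, policy(state, goal))
        trajectory.append(state)
    return trajectory


def run_episode(start, goal, disturbances=None, tol=1e-3, max_steps=200):
    """Closed loop: act, compare the outcome with the prediction, correct."""
    disturbances = disturbances or {}
    state, corrections = start, 0
    for t in range(max_steps):
        if distance(state, goal) <= tol:
            return state, t, corrections
        action = policy(state, goal)
        predicted = world_model(state, action)
        # The "real" outcome is the prediction plus any external push at tick t
        push = disturbances.get(t, (0.0, 0.0))
        state = State(predicted.x + push[0], predicted.y + push[1])
        if distance(state, predicted) > tol:
            # Prediction and observation disagree; the next action is
            # computed from the observed state, which corrects the course
            corrections += 1
    return state, max_steps, corrections


goal = State(1.0, 0.0)
final, steps, corrections = run_episode(State(0.0, 0.0), goal, {3: (0.0, 0.5)})
```

In this run an external push knocks the point sideways at tick 3. The loop notices that the observed state differs from the predicted one and steers from the observed position, so the episode still ends at the goal.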
In practical applications, compared with a traditional gripper robot, the five-finger dexterous hand Xingdong XHAND1 driven by ERA-42 can use a variety of tools to complete more general, more dexterous, and more complex manipulation tasks. For example, after simple training on grasping data for colored cubes, ERA-42 generalizes to grasping a range of previously unseen objects.
Based on ERA-42, the five-finger dexterous hand Xingdong XHAND1 can use a wider range of tools and perform more dexterous operations than a gripper
On both single tasks and long-sequence tasks, ERA-42 shows strong resistance to interference. Tests show that the task success rate improves significantly as the model scales up, an early sign of a "scaling effect" similar to that seen in training large language models.
On the hardware side, to build a general-purpose embodied agent, Xingdong Era has launched a new hardware platform designed for AI. Xingdong XHAND1, for example, has 12 active degrees of freedom and uses purely electric actuation to actively drive the joints of all five fingers. Each finger carries a high-resolution tactile sensor array that provides accurate three-dimensional force and temperature information.
With ERA-42, the versatility and generalization of Xingdong Era's general-purpose humanoid robot in performing tasks will improve greatly. Combined with its previously demonstrated ability to walk and run stably over complex and varied terrain, and with coordinated operation of the upper and lower limbs, its potential application scenarios will become more diverse, moving the native general-purpose embodied agent toward real industrialization.
Xingdong XHAND1