A Conversation with Qiongche and Luming: UMI Arrives, and Embodied Intelligence Reaches Its Moment of Data Equality
Author: Peng Kunfang
Editor: Lü Xinyi
Produced by: Embodied Learning Institute
The data bottleneck in embodied intelligence has not been resolved yet, but fortunately, we have reached the moment of "data equality".
Previously, the main constraint on data was simple scarcity. A million-hour dataset was still wishful thinking, and even that might not quench the thirst. The underlying problem is that current data volume falls far short of a workable level. This is especially true of real-robot data, the higher-quality tier at the top of the data pyramid, whose teleoperation-based collection has structural limitations: expensive robot hardware, complex deployment, low collection efficiency, and data tied to the configuration of the robot that recorded it.
Teleoperation has an obvious volume bottleneck, while simulation data, which has the volume advantage, suffers from an unbridgeable embodiment gap.
To use a loose analogy, the data shortage is like a famine: the real-robot and simulation routes each offer either rice or vegetables, and neither alone makes a full meal.
Today, this situation is changing.
A path to collecting large-scale, diverse, high-quality real data has now been established. It has a smaller gap than simulation data and a clearer volume advantage than teleoperated real-robot data: UMI (Universal Manipulation Interface).
Put simply, it is a low-cost data collection solution that uses a handheld gripper, a camera, and a pose estimation algorithm to convert human hand movements directly into trajectories a robot can learn from. This new paradigm addresses a series of problems in real-robot data collection: high cost, low efficiency, data that cannot be reused across different robot bodies, and limited data diversity.
"In 2026, we hope to establish a production capacity of 1 million hours of embodied real-robot data," Dr. Ding Yan, co-CTO of Luming Robotics, said in a conversation. Dr. Lü Jun, product lead for Qiongche Intelligence's RoboPocket, said his team has started small-scale tests of crowdsourced data collection. "The era when everyone collects data may come earlier than we think."
At the level of the paradigm itself, UMI lowers hardware costs and raises output efficiency, so data is no longer just an expensive, scarce resource or the built-in advantage of a few leading companies. At the level of the ecosystem, the nature of the UMI paradigm means collection no longer has to be confined to data collection factories; it can move into the real physical world and capture more realistic tasks.
In a sense, UMI is ushering in a kind of "data equality".
Alongside the bright side of the new technology, new problems have emerged. Could cheap hardware that makes data easy to obtain push the field toward an extreme pursuit of volume? How can this brute-force "violent aesthetics" balance data quality and diversity?
More importantly, what does UMI mean for embodied intelligence?
Upgraded UMI products have recently been arriving in quick succession. The Embodied Learning Institute spoke about UMI with technical experts from two representative Chinese companies: Dr. Ding Yan, co-CTO of Luming Robotics, and Dr. Lü Jun, product lead for Qiongche Intelligence's RoboPocket. Their technical perspective gives a more realistic picture of data collection today and of where it is heading.
What is UMI?
In the original Stanford paper, UMI was described as a "gripper + vision system" collection solution: lightweight sensors and cameras are mounted on the human operator's hand or end-effector tool to directly record the trajectory, timing, and environmental feedback of a manipulation.
Later, teams such as Generalist and Sunday built on this to bring UMI from academia into industry and began large-scale production of real-world data. (Cheng Chi, co-founder of Sunday, is one of the two first authors of the 2024 UMI paper.)
Image source: Sunday
On the Chinese side, Dr. Ding Yan of Luming had tried collecting data by teleoperating a robotic arm with a joystick while studying for his doctorate in the United States, and found the process cumbersome and tiring. He wished he could "remove" the bulky arm and let people operate a gripper directly. When he saw the UMI work in March 2024, he found it matched his "lazy" idea exactly: focus only on the front-end manipulation.
Qiongche's team began building teleoperation datasets as early as 2021, but found three major bottlenecks in the "data collection factory" model: extremely high cost, unintuitive operation (operating at a distance made the motions mechanical), and a narrow range of scenarios (far removed from the real world). They therefore gradually developed collection solutions that moved away from the robot and the laboratory, progressing from the robot body to an exoskeleton and then to UMI.
So the UMI we see today in effect substitutes a human for the robot body: a person moves through a real environment and operates a robotic gripper to generate manipulation data. If UMI data must be classified, it sits between robot data and human data. It is neither human data learned from pure Internet video nor teleoperation data tightly coupled to a specific robot body.
It should be emphasized, however, that these three kinds of data do not successively replace one another, and none is simply better or worse. In practice, embodied intelligence companies mix them as needed, according to how well they can make use of each.
So why did UMI impress so many manufacturers so quickly? One of the most obvious features of the UMI data collection paradigm is that it is cheap enough.
Most intuitively, there are two reasons. First, it keeps raising the ceiling on how far real data can scale, shaking the long-standing consensus that "real-robot data is difficult to scale". Second, a clear closed loop has formed between UMI data and model training, showing that this type of data can not only be collected but can also train models that perform well.
For the entire embodied intelligence industry, this is a shock. The excitement is not just about a particular collection technology; it is that data no longer belongs only to the "top players".
Image source: Luming Robotics
Taking Luming as an example, its FastUMI Pro delivers an order-of-magnitude improvement in cost, along with higher efficiency, compared with traditional teleoperation. Counting labor cost alone, the UMI solution costs 1/5 as much as teleoperation; counting hardware cost, the ratio is a striking 1/200. Collection efficiency is increased by 3 times.
Image source: Qiongche Intelligence
Qiongche Intelligence took a different approach. RoboPocket uses an iPhone as its core hardware, making maximum reuse of an existing smart device and compressing development and deployment costs. In Dr. Lü Jun's view, "the mobile phone is a very good piece of hardware", and one that is not easy to surpass.
This means large-scale real data is no longer an "exclusive game" for well-funded leading manufacturers. Second- and third-tier companies, previously held back by data costs, can take part in the data competition for the first time.
UMI also decouples the data from the robot body. The same collected data can be adapted to robotic arms of different configurations, so companies are not locked into a configuration by "data binding" or tied to an existing data framework.
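One way to picture this decoupling, as a minimal sketch and not the method used by any company named here, is to store a demonstration as gripper motion relative to its starting point and then replay it from whatever start position a given arm has. The `Step` type and the `to_relative` and `replay_on` functions are hypothetical names introduced for illustration. The sketch tracks only position and gripper width; it ignores orientation and each arm's reach limits, which a real pipeline would have to handle.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Step:
    position: Vec3        # gripper position in metres (hypothetical field)
    gripper_width: float  # gripper opening in metres (hypothetical field)


def to_relative(steps: List[Step]) -> List[Tuple[Vec3, float]]:
    """Express every step as an offset from the first recorded position."""
    origin = steps[0].position
    return [
        (tuple(p - o for p, o in zip(s.position, origin)), s.gripper_width)
        for s in steps
    ]


def replay_on(start: Vec3, relative: List[Tuple[Vec3, float]]) -> List[Step]:
    """Rebuild an absolute trajectory beginning at a particular arm's start position."""
    return [
        Step(tuple(s + d for s, d in zip(start, delta)), width)
        for delta, width in relative
    ]


# One handheld demonstration: move 0.5 m along x, lower 0.25 m, close the gripper.
demo = [
    Step((1.0, 2.0, 0.5), 0.08),
    Step((1.5, 2.0, 0.5), 0.08),
    Step((1.5, 2.0, 0.25), 0.0),
]
relative = to_relative(demo)

# The same relative motion replayed from two different (assumed) arm start positions.
arm_a = replay_on((0.0, 0.0, 1.0), relative)
arm_b = replay_on((2.0, 1.0, 1.0), relative)
```

Because the stored form never mentions a specific arm, the recording itself stays the same when the target robot changes; only the replay step depends on the arm.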
The upshot is that, as data costs fall, the industry is no longer simply a contest in which whoever owns the most robots can brute-force the most data.
In fact, UMI has not always been at the center of discussion over the past few months. The core reason is that the industry has long doubted its data quality. Without high-quality data, UMI is not merely ineffective; it may even be a kind of "poisoning".
One widely repeated estimate held that only about 10% of the data collected by earlier UMI solutions was truly usable. A key question therefore went unanswered for a long time: can UMI-collected data really train a usable model?
At the end of 2025, the situation began to change. Overseas embodied intelligence companies trained models such as Generalist's GEN-0 and Sunday's ACT-1 under the UMI data collection paradigm, providing initial evidence that this path is feasible.
Image description: Sunday's job description for the data collection position. The first requirement is to follow the SOP to ensure data quality.
It was also at this stage that the industry began to realize that the real question around UMI is not whether a lot of data can be collected, but how to manage that data to ensure quality.
UMI is easily misunderstood as "using a camera to record the process of a person operating a gripper". In fact, UMI data is a record of interactive behavior that an AI can understand, that is aligned with the physical world, and that can be reproduced in physical space. It must meet standards on several dimensions, including trajectory accuracy, timing consistency, and picture quality.
Dr. Ding Yan once wrote an article explaining why many UMI devices cannot collect "data that can train models". The reasons include underpowered core hardware, which inherently limits information density, and devices that are assemblies of sensors and not system-level products. The result is low-quality data that cannot enter the training pipeline.
He believes that "data is essentially a replayable embodied interaction trajectory". Learning from real data means reproducing such trajectories. If quality cannot be guaranteed, for example when a trajectory drifts or breaks off, the robot cannot reproduce the actions a human demonstrated. In his words, "This is like an open-book exam. If the answers themselves are wrong, copying more will not get a high score."
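To make "trajectory deviation or breakage" concrete, here is a minimal sketch of the kind of check a collection pipeline might run on a recorded trajectory. It is an illustration under stated assumptions, not Luming's or Qiongche's actual quality system: the function name `find_defects`, the sample layout, and both thresholds are hypothetical.

```python
import math
from typing import List, Tuple

# Assumed sample layout: (timestamp in seconds, x, y, z in metres).
Sample = Tuple[float, float, float, float]


def find_defects(
    samples: List[Sample],
    max_gap_s: float = 0.1,    # hypothetical limit on time between samples
    max_jump_m: float = 0.05,  # hypothetical limit on movement between samples
) -> List[Tuple[int, str]]:
    """Return (sample index, defect description) for every suspicious step."""
    defects = []
    for i in range(1, len(samples)):
        t0, *p0 = samples[i - 1]
        t1, *p1 = samples[i]
        dt = t1 - t0
        if dt <= 0:
            # Timestamps must strictly increase for the trajectory to be replayable.
            defects.append((i, "non-monotonic timestamp"))
            continue
        if dt > max_gap_s:
            # A long pause in the stream suggests dropped frames.
            defects.append((i, "timing gap"))
        if math.dist(p0, p1) > max_jump_m:
            # A sudden leap in position suggests a tracking failure.
            defects.append((i, "position jump"))
    return defects


clean = [(0.0, 0.0, 0.0, 0.0), (0.05, 0.01, 0.0, 0.0), (0.10, 0.02, 0.0, 0.0)]
broken = [(0.0, 0.0, 0.0, 0.0), (0.5, 0.0, 0.0, 0.0), (0.55, 0.3, 0.0, 0.0), (0.5, 0.3, 0.0, 0.0)]

print(find_defects(clean))   # []
print(find_defects(broken))
```

A trajectory that returns an empty list passes these two checks only; picture quality and absolute pose accuracy would need separate tests.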
This has given rise to "feed-forward" data management solutions, which control quality at the point of collection.
Luming Robotics chose to emphasize the data collection SOP and built an industrial-grade data quality evaluation system with 8 stages. Earlier, Dr. Ding Yan led a team of 11 that collected 100,000 real-robot data records (FastUMI-100K) and more than 2,000 hours of data in 3 months, giving the team a deep understanding of data and experience managing it at scale. In addition, the FastUMI Pro device connects directly to a computer, so users can verify data validity in real time during collection, filtering out low-quality data at the source and raising the valid-data rate to over 95%.
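The figures quoted in this article can be combined into a rough back-of-envelope calculation. The sketch below is only arithmetic on those numbers and carries several assumptions: it treats "more than 2,000 hours" as exactly 2,000, assumes per-person throughput would stay constant at larger scale, and assumes the 1-million-hour target counts usable hours. The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def hours_per_person_month(total_hours: int, people: int, months: int) -> Fraction:
    """Average collection throughput per person per month."""
    return Fraction(total_hours, people * months)


def collectors_needed(target_hours_per_year: int, rate: Fraction) -> int:
    """People required to hit a yearly target at a fixed monthly rate."""
    return ceil(Fraction(target_hours_per_year) / (rate * 12))


def raw_hours_needed(usable_hours: int, usable_percent: int) -> int:
    """Raw hours to record so that the usable share reaches the target."""
    return ceil(Fraction(usable_hours * 100, usable_percent))


# 11 people, 2,000 hours, 3 months (figures from the article).
rate = hours_per_person_month(2000, 11, 3)
print(round(float(rate), 1))                # 60.6 hours per person per month

# Collectors needed for 1 million hours in a year at that rate.
print(collectors_needed(1_000_000, rate))   # 1375

# Raw hours needed for 1 million usable hours at a 10% versus a 95% valid rate.
print(raw_hours_needed(1_000_000, 10))      # 10000000
print(raw_hours_needed(1_000_000, 95))      # 1052632
```

Under these assumptions, moving from a 10% to a 95% valid-data rate cuts the raw recording needed for the same usable output by roughly a factor of ten, which is why quality control at the source matters as much as collection speed.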
Qiongche pays more attention to managing the people who collect data. In its view, collectors lack "constraints" when collection is both body-free and distributed. Body-free collection means collectors lack hardware constraints isomorphic to the robot body, so the human workspace may differ from the robot's. Distributed collection means collectors are managed indirectly, which raises problems of guidance, correction, and efficiency in remote collection.
Qiongche Intelligence's newly released RoboPocket builds its understanding of model training into a "Data Tutor App" that issues task instructions, gives real-time interaction prompts, and scores quality along multiple dimensions. Controlling quality at the collection stage avoids large amounts of invalid data in later processing. Qiongche Intelligence is reportedly running small-scale internal tests of crowdsourced collection and may eventually open data collection to a much wider group of ordinary people.
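As a rough illustration of multi-dimensional scoring at collection time, the sketch below accepts an episode only if every dimension clears a minimum and otherwise tells the collector which dimension to fix. It is not the Data Tutor App's actual logic, which this article does not describe. The function `review_episode`, the 0-to-1 score scale, and the 0.8 threshold are assumptions; the dimension names follow the quality dimensions listed earlier in the article.

```python
from typing import Dict, Optional, Tuple


def review_episode(
    scores: Dict[str, float],
    minimum: float = 0.8,  # hypothetical pass mark on an assumed 0-to-1 scale
) -> Tuple[bool, Optional[str]]:
    """Accept an episode only if its weakest dimension clears the minimum."""
    weakest = min(scores, key=scores.get)
    if scores[weakest] < minimum:
        # Immediate feedback lets the collector redo the episode on the spot.
        message = f"redo: {weakest} scored {scores[weakest]:.2f}, needs {minimum:.2f}"
        return False, message
    return True, None


good = {"trajectory_accuracy": 0.95, "timing_consistency": 0.90, "picture_quality": 0.85}
blurry = {"trajectory_accuracy": 0.95, "timing_consistency": 0.90, "picture_quality": 0.40}

print(review_episode(good))    # (True, None)
print(review_episode(blurry))  # (False, 'redo: picture_quality scored 0.40, needs 0.80')
```

Gating on the weakest dimension, as opposed to an average, reflects the point made above that a single bad dimension such as a broken trajectory can make an episode unusable however good the others are.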
Image source: Qiongche Intelligence
According to Dr. Lü, Qiongche may in future offer ordinary users a compact RoboPocket hardware kit priced at a few hundred yuan. Users would pair it with their own phones to complete data collection tasks at home. This would keep lowering collection costs and yield diverse data from real homes, which in turn would feed back into model optimization and iteration.
In short, scale alone is not enough: data must also answer to model training, and only genuinely useful, high-quality data can forge a model capable of fine-grained manipulation.