
Can an army of 600,000 data collectors solve the data hunger of the embodied AI industry?

AI Value Officer · 2026-04-23 19:54
The competition for embodied intelligence data has reached a white-hot stage, with both financing and production capacity on the rise in 2026.

In April 2026, the embodied intelligence industry saw a dense cluster of milestones.

Tashizhihang, founded only 14 months earlier, closed a $455 million Pre-A round, setting a new record for a single financing round in China's embodied intelligence sector. At almost the same time, Guanglun Intelligence disclosed 550 million yuan in first-quarter orders and confirmed that it had completed a 1 billion yuan financing round in March, making it the world's first unicorn in embodied data.

JD.com also reported its progress to date in embodied intelligence, unveiling the industry's first full-link infrastructure for embodied data. It mobilized a collection team of 100,000 employees and 500,000 external participants, aiming to build the world's largest embodied data collection center.

These three events point to the same industry consensus: data has become the core variable that will let embodied intelligence break through its deployment bottleneck. These companies were able to launch their strategies precisely because each had built up a distinct strength in data collection, processing, and application.

This flurry of activity has pushed the long-brewing competition over embodied intelligence data out of the laboratory and into industrial deployment. As of early 2026, the global total of high-quality real-world physical interaction data was only about 500,000 hours, less than one twenty-thousandth of the training data available for large language models.

It is this extreme scarcity of resources that makes data the core bargaining chip in the competition of the embodied industry.

Four routes to embodied data collection, but data silos remain unsolved

After three years of trial and error, the industry has moved past its initial confusion and settled into four technical routes, each attacking the data problem from a different angle: real-machine teleoperation, portable collection, simulation synthesis, and natural human demonstration. None is perfect, but each reached its own critical juncture in 2026.

This is also why capital is willing to bet heavily now: these routes are no longer just laboratory directions, and large-scale deployment is starting to look feasible.

Real-machine teleoperation is widely regarded in the industry as the highest-quality collection method. Zhiyuan Robotics built a dedicated 4,000-square-meter facility in Pudong, Shanghai, pushing daily data output per machine into the thousands, specifically to tackle difficult tasks such as precision assembly and complex manipulation. Cost, however, remains unavoidable on this route: one hour of effective data still costs more than 500 yuan, and the barrier for new operators is extremely high.

Portable collection (the UMI route) is currently the fastest route to scale. Companies such as Luming and Ant Lingbo have successively launched mass-production solutions, freeing data collection from the "data factory" and letting it reach real-world settings such as homes, offices, and convenience stores.

The Achilles' heel of this route is equally clear: mainstream UMI devices capture gripper actions but lack tactile and force feedback, so they are of little use for the precise operations of five-finger dexterous hands. Data quality is also uneven. Without real-time quality control, most of a week's collected data may turn out to be unusable.

Simulation-synthesized data is currently the route with the lowest cost and the largest production capacity. Guanglun Intelligence has gone the furthest here. It has developed in-house a physics simulation engine, described as unique in the industry, that can accurately reproduce how real-world objects move and deform. Through a "test, generate, retest" data flywheel, it can quickly produce large volumes of standardized training data.

The challenges of this route remain prominent, however: a virtual environment can never fully simulate every unexpected situation in the real world. When a model is transferred from simulation to real machines, slight deviations in physical parameters such as friction and damping can still cause actions to fail.

Natural human demonstration is another direction favored by capital this year, with Tashizhihang as its representative. The five-finger smart gloves it developed can accurately capture the motion trajectory and applied force of the human hand. Factory and production-line workers simply work as usual in their real environment while their hand poses in space, finger postures, and operating forces are fully captured.

However, Tashizhihang also faces many problems: a single set of gloves costs more than 10,000 yuan, so the barrier to large-scale rollout remains relatively high. Working habits vary greatly between occupations, and the difficulty of standardizing and governing the resulting data limits adoption in more industries.

Although all four routes are making breakthroughs, none has solved a more fundamental problem: the data cannot be interconnected. Each company uses its own data format and annotation standard, creating data silos. Multi-modal data is out of sync in time and space and dirty data is rampant, producing the familiar "garbage in, garbage out" effect. On the supply side, scenarios and collection capabilities exist, but there is no standardized system for governance and circulation, so many algorithm prototypes stay trapped in the laboratory and cannot reach mass production.

More troublesome still is the structural dilemma of "data following the body": sensor layouts and control modes vary greatly across robot brands and models. Teleoperation data is highly dependent on specific hardware and cannot be reused across different bodies. Each new robot requires data to be collected again, so data assets never accumulate into a genuinely shared industry resource.

This is the real difficulty of the 2026 competition: all four routes are accelerating on their own tracks, but the isolation and fragmentation of data mean that no single route can support the future of general embodied intelligence.

The crowdsourcing model has become a popular bet. Can it deliver both production capacity and data quality?

JD.com's entry into the competition brings a completely different way of solving the problem.

It has chosen a "hybrid data route" that spans all four: instead of betting on a single technology, it applies the supply-chain logic it knows best to combine the strengths of the four routes and directly tackle the industry's thorniest problem, "data structure imbalance".

Judging from the "data pyramid" architecture announced at the press conference, JD.com covers almost all mainstream collection paths. The bottom layer consists of tens of millions of hours of first-person human video, drawn mainly from daily operations at more than 3,600 warehouses and tens of thousands of offline stores across the country. It follows the UMI/Ego route and addresses the industry's biggest headache, a shortage of basic data.

Above that are millions of hours of human hands-on operation data, augmented with the in-house JoyBuilder simulation platform to strengthen action planning and cross-body generalization. At the top is high-value data generated by teleoperation and UMI variants, used to fine-tune specific robot bodies.

Based on this data system, JD.com launched the JoyAI-RA embodied foundation model, which adopts a two-stage architecture of "WAM pre-training + RL post-training": it first learns causal decision-making from large volumes of first-person video, then continuously optimizes through real-world interaction feedback.

JD.com sums up this logic as "not being a silent mine, but providing an instruction manual for human hands-on operation data": it does not want to be a mere data porter, but to turn scattered productive labor that previously carried no data value into standardized, reusable training data. JD.com also released the industry's first embodied intelligence data trading platform, in an attempt to clear the last mile of data circulation.

The core advantage of this solution is scale. The goal of "accumulating 10 million hours of first-person video of real human scenarios within two years with 600,000 people" is essentially an attempt to turn existing productive labor into a data production pipeline rather than build a dedicated data factory from scratch. If the model works, it will cut the collection cost of basic visual data by an order of magnitude, a structural advantage that most start-ups can hardly replicate.
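For a sense of what that target means per participant, here is a back-of-the-envelope sketch. The 10 million hours, 600,000 people, and two-year window come from the stated goal; the even split of hours across participants is purely an illustrative assumption, and the variable names are hypothetical.

```python
# Figures taken from the stated goal.
TARGET_HOURS = 10_000_000   # first-person video hours to accumulate
PARTICIPANTS = 600_000      # 100,000 employees + 500,000 external participants
YEARS = 2

# Illustrative assumption: hours are spread evenly across all participants.
hours_per_person = TARGET_HOURS / PARTICIPANTS
hours_per_person_per_year = hours_per_person / YEARS
minutes_per_person_per_week = hours_per_person_per_year * 60 / 52

print(f"{hours_per_person:.1f} hours per person over {YEARS} years")
print(f"{hours_per_person_per_year:.1f} hours per person per year")
print(f"{minutes_per_person_per_week:.1f} minutes per person per week")
```

Under that even-split assumption, the target works out to roughly 16.7 hours of recorded video per person in total, or just under 10 minutes a week. Actual participation would almost certainly be uneven, so this is only an order-of-magnitude check.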

In fact, JD.com is not the only one betting on crowdsourced collection. Luming Robotics plans to deploy 10,000 backpack-style UMI devices for systematic collection in six real-world scenarios and to build a UMI community of thousands of people. Qiongche Intelligence has launched a "pocket collection" product that collects data with a mobile phone and is running small-scale crowdsourcing tests. Ant Lingbo, Mifeng Technology, and others have also adopted crowdsourcing to expand data production capacity.

However, these pioneers have generally run into similar dilemmas. The lack of unified collection standards leads to uneven data quality. Insufficient real-time quality control leaves a large share of collected data unusable. Standardizing and governing data from different scenarios and occupations is harder still. Some companies have also been forced to narrow their collection scope because of privacy and compliance issues.

JD.com's advantage lies in its real-world scenario resources and its mature ability to organize and manage at scale, which allow it to integrate data collection deeply into existing business processes, backed by a full-link data governance infrastructure.

Even so, it cannot avoid the problems common to crowdsourcing: how to control the "impurity rate" of data while maintaining scale, how to align natural human actions with robot control logic, and how to ensure end-to-end privacy compliance when collecting behavioral data from hundreds of thousands of people.

Although these challenges remain unanswered, JD.com's entry does open a new possibility for the industry: not every company can build its own robots, and not every company can be the best on a single technical route, but every company needs an infrastructure that provides all types of data and full-link services.

Open platform vs. vertical closed loop: who will define the future of embodied intelligence?

If data is JD.com's ticket into the market, its full-link supply-chain capability is its real trump card.

JD.com does not want to be just a data company or a robot company. Its positioning is closer to an industrial middle layer: it uses data, computing power, supply chain, and channels to connect the innovation of robot companies with the consumer market.

On the supply side, JD.com provides not only real data collection scenarios and cloud computing power but also a global supply-chain network, from one-stop procurement of core manufacturing materials, to full-link assembly solutions, to overseas services on the Joybuy platform, helping robot companies solve problems all the way from production to sales.

On the demand side, JD.com has put forward its core retail resources: super category days, robot channel pages, and platform marketing IPs, combined with zero-threshold entry into its self-operated business and full service before, during, and after sales, to convert traffic directly into sales.

In 2026, JD.com's retail division aims to help robot brand partners achieve sales of more than 10 billion yuan. For most robot start-ups, the biggest pain point is often not technology but commercialization: they can build prototypes, but struggle to reach mass production and struggle even more to sell products to consumers. JD.com's full-link service targets exactly this gap.

The JoyInside embodied intelligence platform is JD.com's core link to hardware manufacturers. Through a model of zero service fees and limited-time free access, JD.com adds large-model interaction capability to many types of hardware. So far it has attracted nearly a hundred home appliance and home furnishing brands and more than 40 robot and AI toy brands as partners. From quadruped robots to humanoid robots, and from AI toys to cleaning robots, any device connected to JoyInside can quickly gain large-model interaction capability, greatly shortening the product R&D cycle.

This approach lets JD.com sidestep direct competition with Tesla and Figure AI over humanoid robot bodies and enter the middle layer of the industrial chain instead. From there, JD.com can keep accumulating data from different scenarios and use it to improve its own model, forming a positive cycle of "data, model, product, more data".

We can make a simple comparison between JD.com's industrial open-platform model and Tesla's data closed-loop model.

Tesla follows a vertical route in which "the body is the data factory". Optimus Gen 3 is planned for release at the end of 2026 and is already performing basic test tasks such as screw fastening and material handling at the Fremont factory in the United States. The data the robot generates is naturally aligned with its own hardware and can be used for iteration without conversion, giving it industry-leading data efficiency. But this is a completely closed ecosystem: all data serves only Tesla's own robots and currently covers only standardized factory scenarios.

JD.com, by contrast, has chosen an open-platform route for the entire industry. If Tesla is building "the best single-model robot" for itself, then JD.com is creating "the best industrial soil" for all robot companies.

Tesla's data can only serve Optimus, while JD.com's data capability can be opened to all partner brands. Tesla cares only about whether its own robots can be mass-produced, while JD.com hopes to help start-ups solve pain points along the entire chain, from core component procurement and large-scale assembly to retail sales and after-sales service.

These two routes each have irreplaceable value. Tesla's barrier lies in the deep coupling of hardware and data, while JD.com's barrier lies in the breadth and depth of the industrial ecosystem. They are not in a mutually exclusive competitive relationship but are jointly promoting the progress of the industry in different dimensions.

Conclusion

Perhaps the most notable aspect of this competition over embodied intelligence data is its openness: data, unlike chips, cannot be monopolized by one or two companies. For robots to truly enter factories and homes, the whole industrial ecosystem must work together; the victory of a single route will not be enough.

In the longer term, once deployment scenarios are broad enough and robots numerous enough, data production will shift from today's "active collection" to "passive emergence": every robot operating on a production line and every household chore completed at home may become training material for the next-generation model.

By that time, the data flywheel will truly start to turn, and embodied intelligence will evolve from today's data desert into a self-irrigating and continuously growing ecological rainforest.

This article is from the WeChat official account "AI Value Officer", author: AI Value Officer, published by 36Kr with authorization.