Playing on the Edge: The Big Internet Companies' Cunning Plans for Embodied Intelligence
Today's big tech companies stand at a crossroads: Should they continue to develop smarter software or start learning to deal with materials, machinery, supply chains, and factories? Should they remain ecosystem enablers or become true industry shapers? The former is reasonable, while the latter is difficult. However, the latter might hold the key to reshaping the next decade.
The business rationality of internet giants has crowded out the allure of the embodied intelligence track.
Whether it's the metaverse, large models, or the once-popular educational hardware and VR, internet giants have been everywhere in the technology hotspots of the past few years. With huge cash reserves, these big tech companies have always been quick to seize new opportunities and launch all-out offensives.
However, when it comes to the unprecedentedly popular embodied intelligence track, an unexpected scenario has unfolded. Internet giants find themselves in a rather delicate position. They are neither absent nor fully committed.
In other words, they have never truly "mustered the courage" to cross the dividing line of a "full-stack bet".
This is a very typical "marginal entry" approach. They focus on developing large models, building platforms, and making venture capital investments, but few are willing to dive into factories to work on the hardware of humanoid robots. This "software-centric" tendency is a result of their internet background and genes, and it also reflects their rational business logic. Given the relative stability of their core business, it is neither necessary nor attractive for them to pour resources into an unfamiliar, limited-scale, and seemingly endless hardware field.
Admittedly, for big tech companies, fighting on the periphery is the approach with the lowest trial-and-error cost, letting them seek the greatest return for the smallest investment.
However, embodied intelligence is a field where software and hardware are strongly coupled. Applying embodied large models depends on hardware adaptation and on the data flywheel that starts turning afterward. It is not a business story that can be realized simply by developing a large model and building a platform. In this area that most needs imagination and passion, internet giants have been conservative.
Whether it's conservatism or waiting for the right opportunity to strike, it's still too early to draw a conclusion. But I still prefer their unruly and bold style.
The Capability Boundaries and Interest Trade-offs of Big Tech Companies
The entry of internet giants always comes with a sense of "distance". Their strategies for deploying embodied intelligence show obvious commonalities: they focus on software and platforms and avoid heavy investment in hardware (after all, there have been many failed cases).
Tencent launched the embodied intelligence open platform Tairos, emphasizing that "it hopes to be a partner for all robot manufacturers rather than replacing them by making hardware". Alibaba's investment in embodied intelligence mainly focuses on software aspects such as robot simulation training, multi-modal perception, and general robot brains. JD.com has invested in several robot companies, but its own embodied intelligence brand, JoyInside, focuses on providing interaction solutions for robots. ByteDance has developed the GR series of robot large models but relies on investments to supplement its hardware capabilities. (Ant Group, by contrast, is actually working on hardware.)
The cautious stance of internet giants is essentially the result of the combined effect of their technological endowment and business rationality.
On the one hand, the genes of internet giants determine that they are better at the "software side". Their core capabilities are concentrated in algorithms, distributed systems, model training frameworks, and data operation systems. These capabilities are extremely valuable in the era of large models, making them naturally suitable to play the role of "enablers" rather than "manufacturers".
Take ByteDance as an example. Last year, it officially released its second-generation robot large model, GR-2. This year, Volcano Engine, ByteDance's cloud and artificial intelligence platform, has been stepping up its research and development in embodied intelligence and humanoid robots. It quietly posted an opening for senior humanoid-robot experts that pays up to 120,000 yuan a month, putting annual compensation in the million-yuan range. The role involves leading the research and development of manipulation algorithms for humanoid embodied robots, including algorithm architecture design, grasping algorithms, and VLA (vision-language-action) model research.
ByteDance has already achieved some results in embodied intelligence. In July this year, it launched the general robot model GR-3, which can handle long-horizon tasks and perform dexterous manipulation. In September, it released the "robot brain" Robix, which integrates reasoning, task planning, and human-robot interaction capabilities. Its ByteDexter dexterous hand, with 20 degrees of freedom, can perform fine movements under teleoperation.
On the other hand, the business rationality stems from the huge cost and uncertainty of hardware investment. Building a robot body involves mechanical structure design, motors, sensors, whole-machine development and testing, and supply chain integration, all of which require large capital investment and long-term technological accumulation. Internet giants lack experience in hardware and face high trial-and-error costs. The same logic shows up in their attempts to enter the automotive industry. Although JD.com and Huawei are closely associated with the automotive industry, neither has chosen to set up its own production workshop.
In sharp contrast, automobile manufacturers and hardware companies have dived head-first into the field. Xiaomi's CyberOne project was launched early, and GAC Group has clearly stated that it will start mass-producing embodied intelligence robots in 2027. Recently, XPeng Motors has sparked discussion with IRON, its so-called "most human-like" humanoid robot. IRON's compact body is the product of repeated engineering design refinements and innovations in joint motor technology. This manufacturing gene is exactly what big tech companies lack.
The "software capabilities" accumulated by internet giants over the past two decades not only constitute an advantage for entering the field of embodied intelligence but also form a path dependence that is difficult to break through in the short term. However, further analysis shows that the "knowledge blind spot" in the hardware field is not the only factor hindering their entry. Internet giants have actually done the math.
The current stage of the embodied intelligence industry determines that the "marginal entry" of big tech companies is a rational choice for maximizing interests.
Take humanoid robots as an example. In terms of market size, one research report puts the Chinese humanoid robot industry at approximately 2.76 billion yuan in 2024. According to a report recently released by KPMG, the global humanoid robot market was only 2.03 billion US dollars in 2024. Although it is forecast to reach 13.25 billion US dollars by 2029, a compound annual growth rate of about 45.5%, that remains small next to internet giants that have already achieved tens of billions in revenue. In other words, embodied intelligence has not yet become a "new growth curve" that can affect the revenue structure of big tech companies.
More importantly, the "core business" of big tech companies remains stable. Alibaba's e-commerce and cloud computing, Tencent's social media and gaming, and ByteDance's short video and live streaming still provide stable cash flow and profits. In this situation, the ROI of investing heavily in embodied intelligence hardware is extremely low. They would not only bear the risk of R&D failure but might also face the dilemma of "high investment, slow return", stuck with something "tasteless to eat but a pity to abandon".
The situation resembles that of leading battery manufacturers, which are reluctant to develop special-purpose batteries for robots: the existing power battery market is sufficient to support their performance, and robot-specific batteries show no sign of delivering new growth in the short term. Likewise, internet giants adhere to the principle of "keeping up with the technology", constantly monitoring industry trends so that no technological gap opens up and staying ready to step up their commitment through investment and partnerships when the market matures.
There Can Be More Allure Behind the Rational Choice
Although the "marginal entry" choice of big tech companies is in line with their own interests, it seems conservative. You may even find that their current approach contradicts the underlying logic of the development of embodied intelligence.
The most obvious problem is a broken data flywheel. It has two aspects. First, the data flywheel of hardware deployment has not started to spin. Second, the data that internet giants have already accumulated has not been fully utilized.
The core driving force for the growth of embodied intelligence is the "data flywheel". Robots deployed on real machines interact with the physical world, generating closed-loop "perception-decision-execution" data. This data feeds back into model iteration, continuously raising the level of intelligence. However, because big tech companies do not manufacture robot bodies, they cannot control deployment scale, task types, or environmental diversity. The data collection link is therefore broken, and the flywheel struggles to accelerate.
It is likewise a pity that the massive amount of internet data accumulated by big tech companies has not fully realized its value. Alibaba, ByteDance, and Tencent possess large amounts of human-machine interaction data, AI application data, logistics data, and merchant data. This data could play a real role in bringing embodied intelligence to market, for example by improving product interaction capabilities and enabling precise product marketing.
Image description: JD.com's embodied service system
However, it is refreshing to see that some big tech companies have recognized the potential of this data. JD.com's positioning in embodied intelligence (especially JoyInside) is a typical example. It uses its e-commerce supply chain, logistics system, user interaction data, and scenario data to give embodied intelligence companies all-around support in bringing their products to market. As a result, JD.com has secured an ecosystem niche summed up as "buying robots on JD.com".
Another, less obvious problem is a misreading of the software and hardware markets.
The fact that big tech companies focus only on developing models and solutions essentially stems from the misconception that "software is high-end and hardware is low-end manufacturing". They adhere to the traditional division of labor in which software is the core and hardware plays a supporting role. This approach may seem safe, but it may push the industry into cutthroat, involuted competition.
Currently, internet giants generally focus on providing embodied intelligence software solutions and general model capabilities. However, this path has a serious problem: the software solution market is naturally a "winner-takes-all" market. As the "DiDaHuaMo" group in the intelligent driving industry (shorthand for Horizon Robotics, DJI, Huawei, and Momenta) suggests, only a few solution providers will survive in the end.
At the same time, the model capabilities of big tech companies are converging into homogeneous competition. Their core functions are all "natural language interaction + task planning", differing only in the number of interfaces and the types of compatible hardware. In this competitive situation, there is neither the possibility of one company dominating nor the appeal of technological divergence.
A deeper problem follows. If big tech companies only develop models in isolation, the models will become "arrows without a target", detached from hardware. The essence of embodied intelligence is a symbiosis of "hardware carrier + intelligent model". Without compatible hardware, even the most advanced models cannot deliver their value. Lacking self-developed robot bodies, platforms can only connect with external hardware manufacturers, and the algorithmic advantages of the models cannot be deeply integrated with the mechanical performance of the hardware.
Therefore, a company that truly combines software and hardware may actually have greater business value.
Tesla's success demonstrates the superiority of "software-hardware integration". On the one hand, Tesla's Model series of cars dominates the new-energy vehicle sales list. On the other hand, its FSD solution has achieved a so-called "generational lead". Tesla is now repeating the pattern with Optimus, whose hardware engineering is continuously optimized while its software capabilities evolve in tandem. It is precisely because of the huge business value of this software-hardware integration that Elon Musk's "trillion-dollar compensation plan" was approved.
Conclusion
From a business logic perspective, the "marginal approach" of big tech companies is indeed beyond reproach. It offers higher ROI, lower risks, and a more familiar path, and it also maintains organizational stability. It avoids long-term trial and error in the deep end of the hardware field and preserves the advantages accumulated in the internet era. However, if we take a broader and longer-term view, we will find that the technology industry is reaching a new turning point. For the first time, intelligence is no longer confined to screens, data centers, and the cloud but is starting to enter the real world in a "physical form".
When intelligence takes on a physical form, the industrial value chain will be redistributed. Software is no longer everything. True competitiveness comes from the synchronous iteration of software and hardware, the continuous acceleration of the data flywheel, and the "real-world execution ability" carved out by engineering and algorithms. This is very different from the internet paradigm and requires a different kind of courage, patience, and a worldview with a "manufacturing foundation".
So, today's big tech companies stand at a crossroads: Should they continue to develop smarter software or start learning to deal with materials, machinery, supply chains, and factories? Should they remain ecosystem enablers or become true industry shapers? The former is reasonable, while the latter is difficult. However, the latter might hold the key to reshaping the next decade.
Every cyclical leap in the industry rewards those who dare to take a step beyond the "reasonable path". This may be especially true for big tech companies deploying embodied intelligence.
*Independent author Zhao Jiaru also contributed to this article.
This article is from the WeChat official account "Embodied Intelligence Research Society". Author: Peng Kunfang, Editor: Lü Xinyi. Republished with permission from 36Kr.