Beyond Capital Frenzy: The Valley of Death and Real Inflection Point of Embodied Intelligence
Recently, discussion of and investment in "embodied intelligence" have reached a near-unprecedented frenzy. Leading enterprises have made heavy bets, venture capital has piled in, and industrial giants and startups are competing on the same stage. Embodied intelligence has become the darling of capital and the media. Large-scale financing deals worth billions have emerged frequently, and the entire industrial chain, from upstream to downstream, is in full swing. It seems that anyone a step slower will miss out on the dividends of the era.
However, amid the craze, calmer voices have not disappeared.
As the Economic Daily put it, we should stop the rush into "Artificial Intelligence +". This warning not only points to the "growing pains" of China's AI industry but also sounds an alarm for the current global competition in embodied intelligence.
With capital, policy, and technology all pushing the field forward, we need to step back and ask what really deserves long-term attention and in-depth consideration beyond the "financing boom" and the "star enterprises".
The wave of embodied intelligence has indeed brought about room for imagination and industrial opportunities.
It is not only a milestone in AI's move from the virtual world to the physical world but is also regarded as "the next general-purpose technology platform". However, if we focus only on financing rankings, demo videos, and slide-deck stories, and ignore the sustainability of innovation, the real boundaries of the technology, the health of the industry ecosystem, and social and ethical responsibilities, the so-called "boom" will soon fade like previous technology bubbles, and may bring even greater disappointment and waste of resources.
Therefore, this article tries to go beyond the surface logic of "capital frenzy" and "exploding sectors" and turn to deeper, more durable issues: technological innovation, industrial implementation, responsible governance, and social value.
Let us reflect on which innovation questions are worth considering behind the excitement. Beyond capital and technology, what kind of industrial patience and cultivation mechanisms do we need? Only by answering these questions can the in-depth integration of AI and the physical world truly bear fruit.
Behind the Capital Carnival: The Real Hierarchy of the Embodied Intelligence Industry
The list of financing deals goes on. From 2024 to 2025, the capital market staged a rare collective carnival in embodied intelligence, and single-deal financing repeatedly reached new highs.
Whether it is Figure AI, 1X, and Boston Dynamics in Europe and the United States, or rising Chinese players like Zhipu Robotics, Fourier Intelligence, and Unitree Robotics, all are accelerating their expansion with the help of capital and have become the leading characters in industry news.
Large companies and innovative unicorns have frequently formed strategic partnerships, and top resources are rapidly gathering around a few leading players. The Matthew effect across the industrial chain, from upstream to downstream, is intensifying. "Concentration at the top and binding with giants" has become the main theme of this cycle.
Undoubtedly, China plays a crucial role in the global development of embodied intelligence. According to McKinsey's analysis, if the current trend continues, the global embodied intelligence market will reach $370 billion by 2040, with the Chinese market accounting for 50% of it.
However, behind the excitement lies a colder and more complex industry reality. With the influx of large-scale capital, the "80/20 divide" in embodied intelligence has become more obvious.
Enterprises that have obtained financing and resources can quickly expand their technical teams, overcome key bottlenecks, increase market investment, and even position themselves early in cross-sector ecosystems. Meanwhile, a large number of small and medium-sized enterprises have been marginalized in this wave of capital. Their R&D progress is constrained, their financing channels have tightened, and their market space is being further compressed.
In a sense, the reshuffle period of the industry has arrived. Whether an enterprise can survive the cyclical fluctuations of capital has become the key threshold for it to "live long and go far".
Piles of capital and soaring valuations do not necessarily mean a matching leap in innovation capability. Looking back at this wave of financing, a large share of the funds has flowed towards "large-scale production", "market occupation", and even the "star teams" themselves. Patient capital truly focused on breakthroughs in underlying technologies and long-term ROI is still scarce.
So far, very few commercial cases can be called "killer applications". Whether for humanoid robots or multi-functional mobile platforms, most applications are still in the small-scale pilot or demonstration stage.
"Financing boom ≠ Innovation breakthrough" and "Demo ≠ Productivity" have become increasingly loud refrains within the industry. Capital matters, but only teams that can survive the industrial cycle, are willing to wait for the technology to mature, and dare to dig into complex scenarios are likely to gain a foothold in the next round of industrial upgrading.
The "Space Revolution" of Embodied Intelligence: From Stacking to Evolution
A genuine technological revolution is far more than "adding functions" or simple stacking and upgrading, and the latest breakthroughs in embodied intelligence are no exception. The deepest innovation in the industry is no longer limited to mechanically combining large AI models with hardware; it stems from qualitative changes in spatial intelligence, 3D world generation, multi-modal perception, and reasoning.
The figure above outlines the evolutionary path of AI capabilities, from perceiving the world to deeply integrating with and transforming the physical world.
Spatial intelligence enables AI to understand and reconstruct 3D spatial structure, which is like "opening the eyes" of AI. On this basis, embodied intelligence allows AI to interact dynamically with the real environment through perception, movement, and feedback, truly "experiencing" the world. Finally, physical AI means that AI can not only recognize and learn but also be deployed and collaborate across complex platforms and ecosystems, driving profound changes in the real world.
As the cutting-edge spatial intelligence project World Labs illustrates, AI systems can now perform complex spatial cognition and dynamic reasoning based on multi-source inputs such as images, text, audio, and video. They can not only "see" but also "understand the environment". They can even infer and reconstruct a complete 3D world structure from a single 2D picture.
This leap in spatial perception and dynamic prediction provides a new general-purpose foundation for fields such as industrial robotics, virtual reality, and autonomous driving. It also means that machines are no longer passive "mechanical bodies" but intelligent agents truly capable of "adapting to the world".
The rise of multi-modal fusion and "interaction intelligence" is leading AI into a new stage of development. Compared with traditional robots that can process only single-modality perception, today's embodied intelligence is evolving towards an interactive agent with "full perception, full dialogue, and full feedback".
AI can not only understand images or speech but also integrate multiple signals to support natural language dialogue, environmental perception, instant response, and dynamic adjustment of complex tasks. This multi-modal cognitive ability is giving rise to new AI species with more "action ability" and "transfer ability" than ChatGPT.
More importantly, this revolution in spatial intelligence and multi-modal cognition forces the industry to break away from the old path of "stacking hardware and competing on parameters". If China's embodied intelligence industry pursues only hardware scale, parameter stacking, and slide-deck innovation, it will ultimately miss the strategic window for global spatial intelligence and software-hardware collaboration.
Truly internationally competitive "digital labor forces" must be a new type of intelligent agent that evolves continuously through self-developed large models, simulation training, and open ecosystems. They are not robots that "look like humans" but "super assistants" capable of understanding space, transferring skills across tasks, and adapting to situations.
Embodied intelligence ≠ Humanoid robots. In the future, embodied intelligence is not just about manufacturing robots that "look more like humans" but about creating digital labor forces that can understand the world, adapt to changes, and unleash creativity. This technological transformation from "stacking" to "leaping" will be the key variable for China's industry to break through international competition.
From Performance to Implementation: How Can Embodied Intelligence Cross the "Valley of Death"?
While the embodied intelligence industry is booming, the gap between reality and imagination is constantly widening.
According to Gartner's analysis, as shown in the figure above, embodied intelligence is in the early climbing stage of innovation, and it will take at least 2 to 5 years to reach the application stage. The most common pattern in the industry is that dazzling demo videos far outnumber deployments with real application value. Capital and the media are keen on chasing novel scenes such as "humanoid robots dancing" and "dexterous hands folding paper", but few cases actually move towards large-scale production and social services.
Behind this is a long and tortuous technological and commercial "valley of death". Whether innovation can cross this valley determines the success, failure, and future of an industrial revolution.
Technological bottlenecks are the first hurdle facing embodied intelligence. Whether in battery life, the precision of dexterous hands, or the generalization ability of AI models, the industry faces real challenges. Even in the world's top laboratories, reaching a 90% success rate in training scenarios is already difficult, and that is still a significant gap from the 99% stability required at the industrial level.
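To make the gap between 90% and 99% concrete, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, that a job is a chain of independent steps that must all succeed; the function names and the 10-step job are hypothetical and do not come from the sources cited in this article.

```python
def chained_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming the steps are independent."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


def failures_per_thousand(success_rate: float) -> float:
    """Expected number of failed attempts in 1,000 tries at a given success rate."""
    return (1.0 - success_rate) * 1000


if __name__ == "__main__":
    # Illustrative assumption: a job made of 10 independent steps.
    for rate in (0.90, 0.99):
        chained = chained_success_rate(rate, steps=10)
        print(
            f"per-step success {rate:.0%}: "
            f"{failures_per_thousand(rate):.0f} failures per 1,000 single steps, "
            f"{chained:.1%} success over a 10-step job"
        )
```

Under these assumptions, a 90% per-step rate completes a 10-step job only about 35% of the time, while a 99% rate completes it about 90% of the time. Real tasks rarely have fully independent steps, so the numbers show the direction of the effect rather than a forecast.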
The shortage of data has further restricted the continuous evolution of AI models. The richness, complexity, and variability of the real world are far from being easily replicated in a simulated environment. These shortcomings directly lead to a longer ROI cycle and greater difficulty in commercial implementation.
Under the high expectations of capital and society, humanoid robots have been burdened with too many "killer application" fantasies. A sober look shows that the scenarios that truly create customer value are extremely limited. For many real-world tasks, traditional automation is more efficient and cost-effective than humanoid robots.
Whether in warehousing and logistics, manufacturing and assembly, or medical care, most current embodied intelligence products are still at the "small-batch pilot" or "exhibition hall demonstration" stage. There is still a long way to go before they can widely replace human labor and drive a leap in social productivity.
The deeper challenge is that the industry may be at the threshold of a "plateau period". More and more experts are beginning to reflect on whether the current AI methodology is sufficient to support embodied intelligence in crossing the commercialization threshold.
Without a new paradigm, "adding parameters" and "competing on hardware" alone are unlikely to produce a qualitative change. In a sense, the ultimate goal of embodied intelligence is not to imitate human appearance or actions but to expand human capabilities and stimulate social creativity through the in-depth integration of AI and the physical world.
Embodied intelligent robots do not need to look like humans; they should complement human capabilities. In the future, embodied intelligence should become a reliable assistant, partner, and creator for society and families, rather than just a human-shaped tool. The real driving force for industrial upgrading comes from cross-disciplinary imagination and original design, not from mechanically piling up scale, capital, or policies.
Conclusion
The end of every technological wave is the ebb of a bubble. When the noise of capital gradually returns to rationality, only technological innovation that can cross the cycle, industrial collaboration, social responsibility, and human imagination can enable embodied intelligence to truly move beyond the "hype" and towards "greatness".
Although the embodied intelligence industry stands at the forefront of the global innovation landscape, it still faces numerous challenges and has vast room for improvement before it can achieve large-scale implementation and deeply transform production and daily life.
References:
"How did the financing boom in embodied intelligence heat up?" Source: China Venture Capital News.
"Will embodied AI create robotic coworkers?" Source: McKinsey.
"Embodied AI: How the US Can Beat China to the Next Tech Frontier." Source: Hudson Institute.
This article is from the WeChat official account "Internet of Things Think Tank" (ID: iot101). Author: Peng Zhao. It is published by 36Kr with authorization.