
Lightspeed China Partners and Legend Capital co-lead as "X Square Robot" raises hundreds of millions of yuan within one month to accelerate the training and iteration of its embodied intelligence large model | 36Kr Exclusive

王枪枪Zach · 2025-02-17 15:44
Since its establishment, it has chosen the "end-to-end large model with the unification of the cerebrum and cerebellum" route.

Text | Wang Fangyu

Editor | Su Jianxun

36Kr has learned that the embodied intelligence startup "X Square Robot" recently completed a Pre-A++ round of financing worth several hundred million yuan. The round was led by Lightspeed China Partners and Legend Capital, with the Beijing Robotics Industry Fund and Shenqi Capital participating. The funds will be used to train the next-generation unified general-purpose embodied intelligence large model and to deploy it in real-world scenarios.

X Square Robot was established in December 2023 and aims to build a general-purpose robot by developing a general-purpose embodied intelligence large model. In November 2024, 36Kr reported that it had completed Pre-A and Pre-A+ rounds at the hundred-million-yuan level.

The ultimate goal of a general-purpose robot is to perform tasks autonomously through interaction, perception, and action, as a human does, while generalizing and transferring skills efficiently. The key to achieving this goal is the robot's general-purpose embodied intelligence large model. Overseas, technology companies such as Skild AI, Google DeepMind, and Physical Intelligence (PI) are actively investing in this field.

Embodied intelligence can be mainly divided into the cerebrum (cognition and decision-making) and the cerebellum (motor control). Currently, domestic enterprises are exploring different development paths: some focus on the cerebrum to enhance the robot's language understanding and planning capabilities; some focus on the cerebellum to optimize motion control such as walking and grasping actions.

Other enterprises choose an end-to-end route that unifies the cerebrum and cerebellum, which is also the choice of leading overseas technology companies such as Physical Intelligence (PI) and Skild AI.

X Square Robot has chosen the "end-to-end large model that unifies the cerebrum and cerebellum" route since its establishment.

Wang Qian, the company's founder and CEO, told 36Kr that a true embodied intelligence large model should cover the complete process from perception signal input to action output in a single model, without hand-designed layering or module division. In his view, this is the real path to general embodied intelligence.

"Although the traditional hierarchical architecture can achieve optimization in specific tasks, it is difficult to adapt to the dynamic changes in a complex environment. The end-to-end solution enables the robot to directly map from perception to motion, forming an efficient feedback loop, thereby having stronger autonomous learning and adaptation capabilities in multiple tasks and scenarios."
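The contrast Wang Qian draws can be illustrated with a toy sketch. This is not X Square Robot's model or code; every name below (`make_hierarchical_policy`, `make_end_to_end_policy`, the "reach" and "hold" subgoals) is hypothetical, and a small linear map stands in for a large model. It only shows the structural difference: the layered design passes through a hand-defined subgoal interface, while the end-to-end design maps observations to actions in one step.

```python
from typing import Callable, Dict, List, Sequence

Observation = Sequence[float]  # assumed: flattened sensor readings
Action = List[float]           # assumed: motor commands


def make_hierarchical_policy(
    planner: Callable[[Observation], str],
    controllers: Dict[str, Callable[[Observation], Action]],
) -> Callable[[Observation], Action]:
    """Layered design: a 'cerebrum' picks a symbolic subgoal, then a
    separate 'cerebellum' controller turns it into motor commands."""
    def policy(obs: Observation) -> Action:
        subgoal = planner(obs)            # cognition and decision-making
        return controllers[subgoal](obs)  # motion control
    return policy


def make_end_to_end_policy(
    weights: List[List[float]],
) -> Callable[[Observation], Action]:
    """Single mapping from observation to action, with no hand-written
    interface in between (a linear layer stands in for a large model)."""
    def policy(obs: Observation) -> Action:
        return [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return policy


# Toy usage with made-up numbers.
hierarchical = make_hierarchical_policy(
    planner=lambda obs: "reach" if obs[0] > 0.5 else "hold",
    controllers={
        "reach": lambda obs: [1.0, 0.0],
        "hold": lambda obs: [0.0, 0.0],
    },
)
end_to_end = make_end_to_end_policy([[1.0, 0.0], [0.0, 2.0]])
```

In the layered version, anything the planner cannot express as one of its predefined subgoals never reaches the controller, which is one way to read the adaptability concern raised in the quote above.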

Among the Chinese manufacturers that have chosen the end-to-end route, technical approaches also differ: some prioritize training small models for specific tasks or a single scenario, while X Square Robot has trained on multiple tasks and a large number of scenarios from the beginning to improve the model's generality and adaptability.

Wang Qian said that, across the industry today, almost all of the stronger results on complex tasks that go well beyond a single operation have been achieved by embodied intelligence large models. Small models use a model structure designed specifically for each task, can often perform only the most basic single operation, and cannot generalize.

By contrast, large models focus on scaling up the model through engineering until it becomes fully general-purpose. The two technology stacks are completely different, and experience accumulated with small models does not transfer effectively to building a large model.

In November last year, X Square Robot announced the WALL-A model of its Great Wall (GW) series, which it describes as the embodied intelligence general manipulation large model with the largest parameter scale in the world at present. According to the company, the model can generalize and transfer across varied physical environment variables and action modes from fewer samples, and also has advantages in long-sequence complex operations.

Wang Qian said that after several months of recent iteration, the WALL-A model's capabilities have reached the same level as those of Skild AI and Physical Intelligence, and that in some respects it is even stronger than its overseas competitors.

In terms of task complexity, WALL-A can complete fine manipulation tasks such as zipping zippers and folding clothes, showing adaptability to complex topological structures and complex physical interactions in randomized environments. In terms of accuracy on complex tasks, it performs well in complex manipulation of flexible objects such as folding and hanging clothes, with a task success rate of more than 90% on tasks lasting several minutes.

In addition, X Square Robot's general-purpose embodied intelligence large model can perform semantic navigation without maps or depth input, make on-the-spot decisions and follow instructions in real time based on video, and explore its environment autonomously.

The core team of X Square Robot is based in Shenzhen. The software algorithm team has a dual background in robot learning and large models. On the hardware side, the company has brought together key engineers and executives from leading hardware companies, with mature engineering capabilities and mass production experience.

Founder and CEO Wang Qian graduated from Tsinghua University with a master's degree. He is one of the earliest researchers in the world to propose the attention mechanism in neural networks. During his doctoral studies, he participated in several robot learning research projects at a top robotics laboratory in the United States, and his research experience covers almost all fields related to robot manipulation and home service robots. Co-founder and CTO Wang Hao holds a doctorate in computational physics from Peking University. He previously served as algorithm lead of the Fengshenbang large model team at the Guangdong-Hong Kong-Macao Greater Bay Area Digital Economy Institute (IDEA Institute), where he led the research and development of the first ten-billion-parameter large model in China and of Ziya, one of the earliest hundred-billion-parameter large models.

Investor Views:

Cai Wei, Partner of Lightspeed China Partners, said: We invested in X Square Robot because we value its leading technology roadmap and differentiated competitiveness in the field of embodied intelligence. The company's independently developed end-to-end general-purpose embodied large model is in a leading position in China in terms of generalization and intelligence. We believe that as embodied intelligence becomes the core of the next robotics revolution, X Square Robot is well placed to become an important player globally, thanks to the generality of its technology, its team's execution, and its ability to integrate industrial resources.

Zhu Jia, Partner of Lightspeed China Partners, said: X Square Robot is the leader in end-to-end robot large models in China. It not only leads significantly in model generalization but has also done a large amount of independent development on its R&D toolchain, which is a common feature of successful hard-tech startups of the past. Interestingly, both founder Wang Qian and DeepSeek founder Liang Wenfeng come from a quantitative strategy background. We look forward to X Square Robot having the opportunity to become the DeepSeek of embodied intelligence brains.

Ji Haiquan, Managing Director of Legend Capital, said: Embodied intelligence is currently at a critical turning point in its development. Its application prospects are extremely broad, with the potential to reshape multiple industries. X Square Robot focuses on the research and development of a general-purpose embodied intelligence large model and has firmly chosen the technical route of the "end-to-end large model that unifies the cerebrum and cerebellum". This technical direction is highly innovative and forward-looking. Legend Capital has long followed frontier technology, conducting in-depth research and actively investing in AI and embodied intelligence. With this investment in X Square Robot, we look forward to working hand in hand with the team to help it deepen its efforts in embodied intelligence, accelerate technological innovation and real-world deployment, and lift China's embodied intelligence industry to a new level.

Liang Wangnan, General Manager of Beijing Guorui Fund, said: The Beijing Robotics Industry Fund is optimistic about the broad market potential of the embodied intelligence industry. With the development of artificial intelligence large models and embodied intelligence technology, robots' task planning and execution capabilities are entering a period of rapid improvement, and intelligent general-purpose robots are expected to become a new generation of human-computer interaction terminals. X Square Robot's unified end-to-end foundation model, post-trained in a zero-shot or few-shot manner, has shown good generalization and good completion rates on complex long-horizon tasks. The Robotics Fund will continue to support X Square Robot's development by drawing on Beijing's industry-university-research system and rich downstream scenarios.

Liang Mingshu, Managing Partner of Shenqi Capital, said: We have long been optimistic about X Square Robot's technical advantages and growth potential in the field of embodied intelligence. Since its establishment, the company has firmly chosen the technical path of the end-to-end unified embodied large model. This model not only sets an industry benchmark in parameter scale but also leads in key dimensions such as cross-scenario transfer efficiency and long-horizon complex task handling. In our dealings with the company, we have seen the team iterate faster than expected on model optimization, data accumulation, and product deployment. We look forward to the company continuing to lead technological breakthroughs in embodied intelligence. Shenqi Capital will also continue to focus on innovative applications of embodied intelligence, draw on its own rich industrial scenarios to empower outstanding enterprises, and help bring the technology's benefits to a wider range of people.