
Independent Variable Robotics' Wang Qian: Large models for embodied intelligence can't be built by copying foreign ones.

Wang Fangyu, 2025-05-29 09:00
Wang Qian once founded a quantitative fund in the United States, but "he often couldn't fall asleep at night thinking about making robots." In 2023, he dissolved the fund and returned to China to start a business.

Text by | Wang Fangyu

Edited by | Su Jianxun

Wang Qian has the look of an intellectual and speaks in a calm tone. Once the topic turns to embodied intelligence, however, he reveals a fiercely combative side:

"If you just follow others, you'll naturally fall behind in technology. It's really unpromising."

"Starting a business requires some determination. If you've already found an escape route from the beginning, your mindset is wrong."

Robotics is what Wang Qian is most passionate about. He earned his bachelor's and master's degrees at Tsinghua University and pursued his doctorate at the University of Southern California. He later founded a quantitative fund company in the United States, but after moving into quantitative trading, he "couldn't sleep well for quite a while and regretted not continuing his robotics career."

△Image source: Authorized by the enterprise

In 2023, Wang Qian dissolved the fund and returned to China. He founded "Independent Variable Robotics" in Shenzhen.

Less than a year and a half after its founding, Independent Variable Robotics has completed seven rounds of financing, raising more than 1 billion yuan in total. On May 12, Intelligence Emergence exclusively reported that the company had received a new round worth hundreds of millions of yuan from Meituan alone.

2023 was the year China's embodied intelligence sector began to take off. Jensen Huang (Huang Renxun), the founder of NVIDIA, first predicted that embodied intelligence would be the next technological wave. Both Galaxy Universal and Zhipu Robotics were established that year.

Compared with these two companies, Independent Variable Robotics didn't gain much attention in the early stage. However, with continuous new financing, it is gradually moving towards the center of the embodied intelligence stage.

An investor from a dual-currency institution told Intelligence Emergence that, judging by financing amounts, domestic humanoid robot startups now fall into clear tiers. Three companies make up the first tier: Unitree Technology, Zhipu Robotics, and Galaxy Universal, each with more than 1.5 billion yuan raised. Independent Variable Robotics, with more than 1 billion yuan raised, has moved from the second tier into the quasi-first tier.

As in the early days of large AI models, the domestic embodied intelligence field is split between two opposite attitudes: optimism and pessimism. On one hand, Zhu Xiaohu is pessimistic: "Now every humanoid robot can do somersaults, but where is the commercialization?" On the other hand, investment institutions continue to pour in large amounts of money, and startups are accelerating mass production and giving optimistic growth forecasts.

Wang Qian is a typical representative of the technology believers.

Since its establishment in 2023, Independent Variable Robotics has firmly chosen the technical route of the "end-to-end unified VLA large model" and has pushed R&D forward at a pace of one model update every two to three months.

One year later, with the release of the model by the American company Physical Intelligence (PI), VLA became the mainstream route in the industry.

While most other manufacturers' models are still performing simple Pick & Place operations (grasping an object and placing it somewhere), the WALL-A model developed by Independent Variable Robotics can already enable robots to complete multiple complex and delicate operations such as clothing handling, storage organization, and wire harness arrangement.

△Independent Variable's robot independently makes shaved ice at the GAIE2025 exhibition. Image source: Authorized by the enterprise

The pessimistic view in the market is that "general embodied intelligence is still too early, and the commercialization is unclear." Wang Qian, however, expects the industry to move much faster.

He expects an embodied intelligence large model at the level of GPT-3 to appear in about a year, with the real commercialization cycle of embodied intelligence gradually unfolding over the next one to two years.

Currently, commercial deployments of embodied robots come mainly from two markets: scientific research and education, and reception and performances. In Wang Qian's view, however, both markets are relatively small, have limited significance for the industry's long-term development, and cannot be regarded as the ultimate target markets. As for humanoid robots entering factories to do simple, repetitive work, he bluntly says that "it's actually just a PR (public relations) act."

Wang Qian believes that to achieve truly valuable commercialization, it is necessary to rely on the improvement of the generalization ability of the embodied intelligence model.

Currently, Independent Variable is not in a hurry to promote commercialization but focuses on improving the model's capabilities. Two-thirds of the company's expenditures go to the model and its related businesses.

"Without being modest, Independent Variable is in the leading position among domestic embodied intelligence models. Investors naturally give some preferential treatment to first place. Everyone believes that we can achieve a very high upside and hopes that we can focus more on the grand goal of the general embodied intelligence model," Wang Qian said confidently.

The following is a dialogue between Intelligence Emergence and Wang Qian, the founder of Independent Variable Robotics. The content has been slightly edited:

"The integrated end-to-end model has a higher development ceiling."

Intelligence Emergence: In the past six months, what significant new progress has the company made in terms of model capabilities?

Wang Qian: Our progress is quite rapid. On average, we update the model every two to three months.

Previously, Independent Variable's model was a pure action-output model, with multi-modality input and single-modality output. Starting from October and November last year, we began to develop an any-to-any model, with multi-modality input and multi-modality output. In addition to outputting actions, it can also output language and vision.

Under the framework of full-modality fusion, Independent Variable also performs long chain-of-thought (COT) reasoning. We developed the chain-of-thought capability roughly between these two rounds of financing.

In March this year, Google Gemini Robotics announced their progress, which takes a similar approach: any-to-any and COT. The recently released π0.5 from Physical Intelligence (PI) has also done something similar. So, in fact, we predicted the direction of technological progress very early, and we did this at about the same time as foreign players like PI.

So, we can say that our model is basically on the same level as those of PI and Google, because we did similar things at similar times and achieved similar effects. Domestic manufacturers, by contrast, are generally just starting to move in this direction, so there is still a big gap in progress.

Intelligence Emergence: Has the unified end-to-end VLA large model (Vision-Language-Action Model) become the mainstream technical route?

Wang Qian: Yes. To a large extent, this was driven by PI's release of its new model in October last year. People could see that end-to-end is a good direction and a major trend.

Now, whether or not people believe in it, they will at least wave this flag. However, there is still a big difference in whether they can actually do it well, or whether they are really doing end-to-end at all. At the same time, you will find a lot of so-called "definition theory" in the market, re-"inventing" what end-to-end means.

To add, there are two different approaches to the end-to-end route. One is a two-layer model path like Figure's: a high-level VLM for reasoning and planning, and a low-level VLA for actual action generation. The other makes no such distinction and uses a single integrated end-to-end model.

We also tried the two-layer model in the early stage, but found that the ceiling of the single-layer model is significantly higher than that of the two-layer model. So, Independent Variable prefers the unified end-to-end paradigm.

△Image source: Authorized by the enterprise 

Intelligence Emergence: What are the technical routes parallel to the end-to-end route?

Wang Qian: There are several parallel routes, but few people are actually pursuing them now. They mainly involve using three-dimensional vision or other methods for perception, combined with some traditional control, to perform Pick & Place operations (grasping and placing).

The above methods may be applicable in some scenarios, such as very simple Pick & Place tasks, including the previous generation of industrial automation scenarios. However, this is obviously not what we are pursuing. Figure and Boston Dynamics used to adopt this approach but have now shifted to the end-to-end route.

Intelligence Emergence: If we compare the current capabilities of Independent Variable's embodied intelligence model to the development stages of large AI models, which stage is it at?

Wang Qian: I think it is still at the GPT-2 stage. GPT-3 had some obvious features that our current model is not yet large enough to exhibit. Companies like PI and Google are at a similar point, which is dictated by the Scaling Law.

Intelligence Emergence: How long will it take for the domestic large-scale embodied intelligence model to achieve commercialization?

Wang Qian: About a year at the fastest and about two years at the slowest. I'm referring to real commercialization, where users are actually willing to pay. Of course, commercialization has different stages. Entering the consumer (C-end) market, with household nanny robots or indoor service robots, will take longer, maybe three to five years.

People generally overestimate short-term technological progress and underestimate medium- and long-term progress. It will be faster than people think.

Intelligence Emergence: When it comes to embodied model training, people always say that data shortage is a bottleneck. Do you have enough data?

Wang Qian: Data is a time-related issue. For example, when you have no feel for or understanding of the embodied model at the beginning, collecting a large amount of data may not be the right solution. Most of the data collected may be useless or of low quality. So, the amount of data collected should match your understanding of embodied intelligence.

Improving the data collection scale is just one aspect. How to improve the data quality and deeply understand what kind of data is needed is another aspect. Independent Variable has done a lot of work on the latter, which is a more efficient way.

Currently, the quality of some open-source datasets and third-party data is generally substandard. If you actually use such data for model training, the model performance will not be very good. These data can be used as a supplement but cannot be completely relied on. Currently, our data is mainly collected by ourselves.

Intelligence Emergence: In this wave of the embodied intelligence boom, domestic startups are generally quite cautious in spending money, as if they are preparing for a cooling-off period. What do you think?

Wang Qian: First of all, Independent Variable is quite cautious in spending money. We will never spend money on unnecessary things. We are doing something long-term and significant, and we need to prepare for possible fluctuations in the industry.

On the other hand, we still need to spend money when it's necessary. Without investment, we really can't achieve anything. If we always wait for foreign open-source results and then follow or copy them, it's really unpromising, and we will never achieve the ultimate goal of general robots.

The issues of confidence and preparing for the winter actually reflect a lack of ability, which leads to a lack of confidence. If you really have enough ability and judgment, you won't think about this problem in this way. The initial team genes and ability levels will determine many strategic judgments and ways of looking at problems.

Ultimately, why does the trough in the industry come? It's because the industry has not achieved any actual results. Once the results are achieved, there will naturally be a peak. Why not be the company that leads the peak and the investment boom instead of passively adapting to the environment? I think this is the mindset that an entrepreneur should have.

"The value and significance of some commercialization scenarios are questionable."

Intelligence Emergence: How do investors evaluate Independent Variable's technical capabilities? By DEMO videos or on-site real-machine demonstrations?

Wang Qian: We always conduct real-machine demonstrations. Since the day of its establishment, Independent Variable has insisted that real-machine demonstrations are the top priority. There are too many ways to fake videos. Only on-site can you see the real performance of the model. You even need to interact with the robot on-site and deliberately interfere with it to see how the model performs in various extreme situations. This is what really reflects the level of the model.

Intelligence Emergence: At the current valuation level, do investors have commercialization requirements for Independent Variable?

Wang Qian: It depends on the investors. Some investors focus more on how high the upper limit of the embodied intelligence model's capabilities can reach, while others focus more on commercialization. The preferences of different investors vary greatly.

Independent Variable is a bit special. Without being modest, we are in the leading position among domestic embodied intelligence models. Investors naturally give some preferential treatment to first place. Everyone believes that we can achieve a very high upside, so they won't require us to commercialize just for the sake of it. Instead, they hope that we can achieve "valuable" commercialization and focus more on the grand goal of the general embodied intelligence model.

Intelligence Emergence: You haven't officially released the physical product yet. How can you meet the commercialization requirements of some investors?

Wang Qian: Actually, we already have a physical product, but we haven't officially launched it on a large scale. Moreover, our physical product has already been sold and put into use, mainly in service-oriented scenarios. In addition to the current model, we will also launch new physical products.

△Image source: Authorized by the enterprise 

Intelligence Emergence: Is the technology for embodied intelligence to enter the service industry mature now?

Wang Qian: We are still in the POC (Proof of Concept) stage with our seed customers. There is a good chance of getting there between the end of this year and the beginning of next year, but of course, a lot of engineering work still needs to be done. Moreover, we won't be limited to simple Pick & Place operations (grasping and placing).

Overly simple Pick & Place operations do not help the further training and development of the embodied intelligence model. The previous generation of technology could already achieve this, and even pure automation technology can meet the requirements. Independent Variable hopes to create scenarios that are diverse, complex, and open enough that previous technologies couldn't cover them.

Intelligence Emergence: If you complete the