The First Year of Physical AI: A Trillion-Dollar Gamble on "How the World Works"
In March 2026, AMI Labs, founded by Yann LeCun, a Turing Award laureate and former chief AI scientist at Meta, announced the close of a $1.03 billion seed round.
Almost at the same time:
- World Labs, founded by Fei-Fei Li, completed a new round of financing of approximately $1 billion.
- Google DeepMind released the Genie 3 world model.
- Tesla continued to promote the deployment of the Optimus humanoid robot in factories.
These events did not happen in isolation. Together, they point to a clear trend: AI is moving from "understanding the digital world" to "understanding and acting on the physical world."
If 2024 was the expansion phase of large language models and 2025 the exploration phase for agent deployment, then in 2026 the core narrative in Silicon Valley is shifting to a more fundamental question: can AI truly understand "how the world works" and complete tasks in the real world?
This is more than a change in technical direction; it means the industrial value chain is being rewritten. For the past two years, AI competition has been concentrated in a few high-barrier areas: models, computing power, and data centers. Once AI enters the physical world, the competition extends beyond the model layer to hardware bodies, system integration, data collection, simulation environments, supply-chain collaboration, and deployment in real scenarios. In other words, Physical AI brings not a single-point breakthrough but a reconstruction of the entire infrastructure stack.
For this reason, this wave may be more than another surge of technological enthusiasm; it may be a rare structural window of opportunity for the Chinese-speaking world, especially Chinese entrepreneurs, engineers, and investors. Unlike the previous round, which was dominated by large-model training resources and super-scale capital, Physical AI naturally rewards composite capability: understanding both algorithms and engineering, coordinating complex systems, and reaching deep into manufacturing, supply chains, and industrial scenarios. Teams that combine technical depth, hardware collaboration, and a Sino-US industrial perspective are more likely to occupy key positions in this new cycle.
In other words, Physical AI is not just a new story told in Silicon Valley. It may also be the most valuable ticket Chinese builders hold for the next global shift in technological infrastructure.
01 The Route Dispute of the Century: The LLM Camp vs. the World Model Camp
Over the past three years, large language models (LLMs) have almost single-handedly defined AI's development path. Their core paradigm is next-token prediction over massive text corpora. But the boundaries of that paradigm are becoming visible: it can "describe" the physical world, yet it lacks executable understanding and the ability to model causal relationships and physical constraints, and it performs poorly on continuous decision-making and long-horizon tasks.
A camp represented by Yann LeCun has therefore begun to promote another path: the world model, which predicts "states" rather than "text." The core difference is this: an LLM takes text as its learning object and language as its output, essentially stopping at the level of "cognition and expression," while a world model takes the state of the physical world as its modeling object, aiming directly at the closed loop of perception, decision, and execution.
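The contrast between the two learning targets can be made concrete with a deliberately tiny sketch. Everything below is illustrative (a hand-coded bigram table and a hand-coded gravity rule stand in for trained models); it shows only what each paradigm is asked to predict:

```python
# Toy contrast between the two paradigms (illustrative, not any real model).

def next_token_predict(history: list[str]) -> str:
    """LLM paradigm: the learning target is the next token in a text stream."""
    # A trivial bigram "model" standing in for a trained LLM.
    bigrams = {"the": "ball", "ball": "falls"}
    return bigrams.get(history[-1], "<unk>")

def next_state_predict(state: dict, action: str, dt: float = 0.1) -> dict:
    """World-model paradigm: the learning target is the next *physical state*.

    A hand-coded gravity rule stands in for a learned dynamics model."""
    g = 9.81
    v = state["v"] - g * dt            # gravity acts regardless of any words
    y = max(0.0, state["y"] + v * dt)  # the floor is a physical constraint
    if action == "catch":
        v = 0.0                        # actions alter the state rollout
    return {"y": y, "v": v}
```

The first function can only continue a description of a falling ball; the second predicts where the ball will actually be, and how an action changes that, which is the "perception-decision-execution" loop in miniature.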
This is not just LeCun's judgment. In Q1 2026, the world-model direction saw several key developments almost simultaneously: AMI Labs, built around the JEPA architecture, explicitly bet on the long-term route of "research first, products later"; World Labs started from "spatial intelligence," trying to make AI truly understand the relationships, occlusions, and physical constraints of the three-dimensional world; and Google DeepMind used Genie 3 to generate real-time interactive dynamic environments for agent training.
Although the three companies have different paths, they point to the same trend: The next leap of AI is not just to generate better text but to model the world more accurately and take actions in it.
02 The Hardware War: Who is Building the "Body"?
The world model solves the "brain" problem: how AI understands the physical world. The other half of the Physical AI battlefield is just as fierce: who will build the "body"?
In 2026, the humanoid-robot track moved fully from the "laboratory demo" stage to the "factory mass-production" stage. Some key figures:
Tesla Optimus Gen 3: more than 1,000 units are deployed at Gigafactory Texas and the Fremont factory performing parts-handling and assembly tasks, the largest factory deployment of humanoid robots to date. Tesla is building a dedicated plant at Giga Texas with an annual capacity of 10 million units, aiming to bring the per-unit cost down to $20,000; two years ago, industry prices still ranged from $50,000 to $250,000.
Boston Dynamics Atlas: the production version shown at CES 2026 stands 6.2 feet tall, has 56 degrees of freedom, and can lift 110 pounds. More noteworthy is its "soul": Boston Dynamics announced a partnership with Google DeepMind to integrate a cutting-edge foundation model into Atlas. Its 2026 annual output has already been reserved by Hyundai and Google DeepMind, and a factory with an annual capacity of 30,000 units is being planned.
Figure 03: Figure AI raised $1 billion at a $39 billion valuation. During an 11-month trial at BMW's Spartanburg factory, Figure 02 participated in the production of more than 30,000 BMW X3s, moved more than 90,000 parts, and logged 1,250 hours of runtime. Figure 03 is a comprehensive upgrade on that basis, with more than 48 degrees of freedom and the proprietary Helix AI platform.
Mind Robotics: announced a $500 million round in March, focused on the industrial-scale deployment of AI robots.
However, in this hardware competition, an underestimated link is emerging: the dexterous hand.
A humanoid robot's legs solve locomotion, and its torso solves load-bearing. But what really determines whether a robot can work in a complex environment is the hand. Take Tesla Optimus: the hand accounts for 17% of the whole machine's cost, about $9,500, making it the most expensive single component.
The difficulty of the dexterous hand lies in a fundamental contradiction: finger space is too small for large motors; small motors lack torque, so high-reduction-ratio gearboxes are needed to amplify force; and high-reduction-ratio gearboxes introduce inertial distortion, loss of force feedback, and mechanical wear. These three problems "poison" the AI learning process at the physical level.
A number of new companies are trying to break this bottleneck. Some use an axial-flux motor architecture to compress the reduction ratio from 288:1 down to 15:1, yielding a fully backdrivable dexterous hand; others co-design data-collection gloves so that human-operation data transfers to robot hardware without loss. These seemingly small hardware innovations may be among the most critical pieces of infrastructure in the entire Physical AI ecosystem.
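The physics behind that trade-off is standard gear-train arithmetic, and a back-of-envelope calculation shows why the reduction ratio matters so much. All numerical values below are hypothetical, chosen only to illustrate the scaling:

```python
# Why a high reduction ratio "poisons" force control in a dexterous hand.
# All numbers are hypothetical; only the scaling laws matter.

def joint_torque(motor_torque: float, ratio: float, efficiency: float = 0.85) -> float:
    """Output torque grows linearly with the gear ratio (minus gearbox losses)."""
    return motor_torque * ratio * efficiency

def reflected_inertia(motor_inertia: float, ratio: float) -> float:
    """Inertia felt at the joint grows with the *square* of the ratio.

    This quadratic term is what makes high-ratio fingers 'numb' to
    contact forces and hard to backdrive."""
    return motor_inertia * ratio ** 2

J_MOTOR = 1e-6  # kg*m^2, a tiny finger motor (hypothetical value)

high = reflected_inertia(J_MOTOR, 288)  # 288:1 gearbox
low = reflected_inertia(J_MOTOR, 15)    # 15:1 gearbox
# (288/15)^2 ~= 368: dropping to 15:1 cuts reflected inertia roughly 370-fold,
# at the price of ~19x less torque amplification, which a stronger
# (e.g. axial-flux) motor must make up directly.
```

Torque scales linearly with the ratio, but reflected inertia and amplified friction scale quadratically, so lowering the ratio buys back force transparency far faster than it costs torque.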
03 NVIDIA: The "Shovel Seller" in the Physical AI Era
In every technological wave, there will be a "shovel seller."
In the large-model era, NVIDIA became the biggest beneficiary through its GPUs and the CUDA ecosystem. In the Physical AI era, its role is upgraded further: not only supplying computing power but also building an entire set of infrastructure for the robot era.
At the GTC conference in March 2026, NVIDIA released a full set of platform capabilities around Physical AI, including the Isaac GR00T vision-language-action model for humanoid robots, the Cosmos series for generating large-scale synthetic data, and a toolchain covering training, evaluation, and deployment (such as Isaac Lab and OSMO). These are not single-point tools; they are gradually forming a complete development and operations system.
Many robot companies, including Boston Dynamics, Caterpillar, Franka Robotics, LG, and NEURA Robotics, have built next - generation systems on the NVIDIA platform.
Its strategy is also very clear:
Do not directly participate in terminal products but become the underlying standard for the entire industry.
If Physical AI is a city under construction, then NVIDIA is supplying the cement, the rebar, and the power grid all at once.
04 Data: The Rarest "Oil" in Physical AI
In the world of large language models, the Internet provides almost infinite text data. But in Physical AI, a more fundamental problem emerges:
Real-world manipulation data is extremely scarce.
This makes data one of the most critical and rarest resources in the entire industrial chain.
Currently, the industry is mainly exploring three paths.
The real-data path. Represented by Physical Intelligence, whose π0 model is trained on more than 10,000 hours of real robot-operation data spanning multiple robot form factors and task types, and can complete complex manipulations (such as folding clothes and assembling cartons). Its decision to open-source the model effectively hands the industry a set of "manipulation pre-training bases."
The synthetic-data path. Google DeepMind's Genie 3 and NVIDIA's Cosmos use world models to generate large numbers of simulated environments, complete training in the virtual world, and then transfer to the real one. The core challenge of this path is the sim-to-real gap, but as simulation fidelity improves, that gap is gradually narrowing.
The human-teleoperation path. Devices such as data-collection gloves map human operations directly onto the robot system. This yields the highest-quality data but remains limited in cost and scalability.
Tesla is trying a hybrid path: continuously capturing human-operation behavior from factory video and using it to train Optimus's action capabilities.
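A standard way the synthetic-data path attacks the sim-to-real gap is domain randomization: perturb the physical parameters of every simulated episode so a policy cannot overfit to one particular "world." The sketch below is the generic technique, not the pipeline of Genie 3, Cosmos, or any specific company, and all parameter ranges are invented for illustration:

```python
import random

# Minimal domain-randomization sketch (generic technique; ranges are made up).
# Each simulated episode draws its own physics, so the real world ends up
# looking like just one more sample from the training distribution.

def sample_sim_params(rng: random.Random) -> dict:
    return {
        "friction":  rng.uniform(0.4, 1.2),    # surface friction coefficient
        "mass_kg":   rng.uniform(0.1, 0.5),    # object mass
        "latency_s": rng.uniform(0.0, 0.05),   # actuation delay
        "cam_noise": rng.uniform(0.0, 0.02),   # pixel noise std-dev
    }

def make_training_batch(n_episodes: int, seed: int = 0) -> list[dict]:
    """Deterministic (seeded) batch of randomized episode configurations."""
    rng = random.Random(seed)
    return [sample_sim_params(rng) for _ in range(n_episodes)]

batch = make_training_batch(1000)  # 1,000 distinct simulated "worlds"
```

The seed makes runs reproducible, while the spread of parameters is what forces the learned policy to be robust rather than tuned to a single simulator configuration.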
In the long run, the competitive landscape of Physical AI may hinge not on whose model is best but on who holds the most, and highest-quality, physical-world interaction data. Once the data flywheel starts turning, its barriers grow exponentially.
05 What the Money Is Saying: A Panoramic View of Physical AI Financing in Q1 2026
Numbers don't lie. Here are the key financing events in the Physical AI field in the first quarter of 2026:
[World Model Layer]
· AMI Labs (LeCun) — $1.03 billion seed round, valuation of $3.5 billion
· World Labs (Fei-Fei Li) — $1 billion new round, with Autodesk investing $200 million
[Foundation Model Layer]
· Physical Intelligence — Negotiating a new round of $1 billion, valuation to exceed $11 billion
· RLWRLD — $41 million seed-round extension
[Humanoid Robot Whole Machine]
· Figure AI — Previously raised $1 billion at a valuation of $39 billion (2025)
· Mind Robotics — $500 million for industrial - scale deployment
· Galaxea — $434 million, Series B unicorn
· Humanoid — $290 million seed round, a unicorn out of the gate
· Generative Bionics — €70 million seed round
[Infrastructure and Tools]
· NVIDIA — Continuously investing in the Isaac GR00T / Cosmos platform
· RoboForce — $52 million, a Physical AI labor platform
On the above public data alone, the Q1 total already exceeds $6.4 billion. And that does not count internal investment by large players such as Tesla, Hyundai/Boston Dynamics, and Google DeepMind.
The flow of capital says one thing: Physical AI has passed the "proof-of-concept" stage and entered the "infrastructure-building" stage. Investors are no longer asking "Can robots work?" but "Whose infrastructure can scale robots the fastest?"