How will Physical AI transform the robotics industry? A full record of the closed-door meeting between NVIDIA and the founders of Unitree and Galaxy Universal.
Jensen Huang mentioned in multiple speeches this year that NVIDIA is actively deploying "Physical AI."
Physical AI will enable autonomous machines such as robots and self-driving cars to acquire motor skills, helping them understand and interact with the real world. Jensen Huang emphasized that Physical AI will bring revolutionary breakthroughs to the robotics field, stating plainly: "We have entered the era of AI inference, and the next wave will be Physical AI."
At the 2025 World Robot Conference, Rev Lebaredian, vice president of NVIDIA Omniverse and simulation technology, said that Physical AI will unlock a real economy worth tens of trillions of dollars. Compared with the roughly $5 trillion IT industry, the combined volume of physical industries such as manufacturing, logistics, and healthcare exceeds $100 trillion. If robots can connect computing power with these industries, productivity will rise dramatically, bringing exponential change.
After the conference, Rev Lebaredian, together with Wang He and Wang Xingxing, founders of NVIDIA's robotics ecosystem partners Galaxy Universal and Unitree, held a closed-door exchange with several media outlets to further discuss the future development path of Physical AI.
At the meeting, Rev Lebaredian highly recognized the development of the Chinese market in the field of Physical AI. He told Tencent Technology, "China has unique scale and talent advantages in the fields of Physical AI and robotics, forming a unique ecosystem. China not only has profound professional capabilities in manufacturing electronic hardware and key components of robots but also has the world's leading manufacturing scale. These advantages have laid a solid foundation for the rapid development of the Physical AI and robotics industries."
The following is the complete transcript of the exchange meeting:
Rev Lebaredian, vice president of NVIDIA Omniverse and simulation technology
Rev Lebaredian, vice president of NVIDIA: Physical AI, an intelligent revolution that brings computing into the real world
In the past three or four decades, we have built the computer and IT industries, which have amplified the capabilities of every industry. However, the impact of computing has mostly remained in the "information space": content that can be digitized, such as language and other encodable information.
The emergence of the Internet truly brought computing technology into everyone's life, connecting all people and driving decades of growth. By global market size, the IT industry totals roughly $5 trillion; huge as that is, it is only a small fraction of the more than $100 trillion represented by all global industries. Those other industries are more valuable because they deal with the "atoms" of the real world: areas such as transportation, manufacturing, supply chains, logistics, healthcare, and pharmaceuticals.
Today, with the emergence of artificial intelligence, we finally have the ability to endow machines with "physical intelligence" and truly connect the physical world with the information world. In other words, the power of computing is no longer confined to the $5 trillion information market but can enter the $100 trillion physical world market. And the bridge for this is robots. With robots, we can bring computing and artificial intelligence into the real world and create intelligent agents that can understand and change the physical environment.
China is the best place to achieve this leap because it has unique conditions:
- Top AI talent: Nearly half of the world's artificial intelligence researchers and developers are in China, including the best talent from top universities.
- Electronics and computing technology capabilities: China not only has R&D capabilities but also an unmatched electronics manufacturing industry, which is crucial for Physical AI and robotics.
- A large manufacturing foundation: There are real scenarios for large-scale deployment and testing of robots here, which can quickly collect data, iterate algorithms, and allow robots to continuously evolve.
Therefore, it is not surprising to see so much energy, capabilities, and enthusiasm at the World Robot Conference.
NVIDIA has also contributed a unique piece to this puzzle. We have long dreamed of participating in solving this problem and have been working hard for a long time. In the robotics field, we have built three types of computers:
- Robot body computers: Embedded inside robots, such as the computers in self-driving cars or humanoid robots. The Jetson Thor, specially designed for humanoid robots, belongs to this category. At this year's WRC exhibition, you could see them on Galbot and other exhibited robots.
- AI factory computers: Before a robot body computer can be used, the robot's "brain" must first be developed. This relies on DGX and HGX systems to process massive amounts of raw data, produce Physical AI algorithms, models, and neural networks, and then deploy them to robots.
- Simulation computers: Data about the physical world cannot simply be scraped from the Internet; it can be acquired in only two ways: collected by real-world sensors, or generated through computer simulation grounded in physical laws and the rules of the world. Simulation can not only generate data but also test robots before deployment to ensure they operate safely in real environments, and the testing can run faster than real time.
In the robotics field, NVIDIA has a complete Isaac platform, which combines hardware with the software stacks required by the three types of computers, including: runtime and computing environments, simulation tools, and training frameworks. NVIDIA Jetson Thor is a supercomputer specially designed for intelligent inference agents in the physical world (especially robots), and Jensen Huang calls it a "real-time inference machine."
Highlights of Jetson Thor's performance:
- Its computing power is 7.5 times that of the previous-generation Jetson Orin, approaching an order-of-magnitude leap;
- The performance per watt is increased by 3.5 times;
- The CPU performance is increased by 3.1 times;
- The I/O throughput is increased by 10 times, meeting the high-bandwidth perception requirements.
The Isaac platform also includes NVIDIA's simulators and simulation frameworks:
- Isaac Sim: Environment and sensor simulation, robot testing, and generation of synthetic data.
- Isaac Lab: A simulation platform for reinforcement learning.
- NVIDIA Cosmos: A world foundation model and framework that supports the construction of AI that understands the physical world and combines with simulators such as Omniverse to generate more accurate and large-scale data.
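To make the synthetic-data idea concrete, here is a minimal, dependency-free sketch of domain randomization, the core technique simulators such as Isaac Sim use to turn a single scene template into many varied training samples. All parameter names and ranges below are illustrative assumptions, not the Isaac API.

```python
import random

def randomize_scene(base_scene, rng):
    """Return a copy of the scene with randomized physical parameters.

    Domain randomization varies lighting, friction, and object pose so a
    policy trained on synthetic data transfers better to the real world.
    Parameter names and ranges here are illustrative, not Isaac APIs.
    """
    scene = dict(base_scene)
    scene["light_intensity"] = rng.uniform(0.5, 1.5)   # relative brightness
    scene["friction"] = rng.uniform(0.4, 1.0)          # surface friction coefficient
    scene["object_x"] = base_scene["object_x"] + rng.uniform(-0.05, 0.05)
    scene["object_y"] = base_scene["object_y"] + rng.uniform(-0.05, 0.05)
    return scene

def generate_dataset(base_scene, n, seed=0):
    """Generate n randomized scene variants as synthetic training samples."""
    rng = random.Random(seed)
    return [randomize_scene(base_scene, rng) for _ in range(n)]

base = {"object_x": 0.3, "object_y": 0.0}
dataset = generate_dataset(base, n=1000)
print(len(dataset))  # 1000 randomized variants from one real-world template
```

A real pipeline would randomize far more (textures, camera pose, sensor noise) and render images, but the principle is the same: one real-world template, many cheap synthetic variations.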
Although world foundation models are still in their infancy and cannot fully understand the world, they are already very useful and have brought new capabilities to robot R&D.
Wang Xingxing of Unitree Technology: The co-evolution of AI and robots toward the next technological era
Wang Xingxing, CEO of Unitree Technology
In the past few years, we have placed great importance on humanoid robots as a development direction. In a sense, I regard the humanoid robot as an important carrier for general-purpose robots. General AI is currently the most mainstream development direction globally, and true General AI must rely on robots, especially general-purpose robots, to perform tasks.
By comparison, the humanoid is currently the most ideal form for a general-purpose robot. Although humanoid robots seem more complex, their actual structure is not as complicated as imagined: in essence, they are several joint motors connected in series. The structure is therefore relatively simple, unlike tracked vehicles and some other robot forms, which are actually more complex.
I have always believed that when General AI matures on a large scale, everyone can easily build a humanoid robot, just as people can buy computer components to assemble a computer today. In the future, if AI is powerful enough, the requirements for hardware will become lower and lower.
We launched a robot last May at a price of about 99,000 RMB, and it remains highly competitive in the market today. It offers an excellent joint count and flexibility, and since its launch its architecture has become a fairly mainstream design configuration globally.
In the second half of last year and this year, many emerging robot companies have released products with similar architectures, differing only in appearance. Our design is smooth in shape and simple in structure, while others may be more complex and less aesthetically pleasing, so this product remains highly competitive in the market.
Recently, we launched a new version. Although its paint scheme is a bit flashy, we want customers to freely modify and repaint the appearance, such as changing the color or adding personalized decorations. Many customers dress the robot in clothes, hats, or wigs during outdoor livestreams, creating a variety of looks. Customizability of appearance and form is crucial to the customer experience. The new version is priced at about 39,000 RMB, is globally competitive with excellent performance, is currently in stock, and is expected to reach full mass production by the end of the year.
In addition, we recently launched the A2 robotic dog. Its biggest feature is a large load-carrying capacity in a compact, lightweight design: it weighs about 37 kilograms, can carry a continuous load of up to 30 kilograms, and has an unloaded range of 20 kilometers. Its appearance draws on previous design experience, looks more sci-fi, and is dust- and water-proof. We have always hoped robots could replace humans in heavy, dangerous, or repetitive industrial tasks; our robotic dog has already run 24 hours continuously in some public-welfare projects, with automatic charging and patrol-inspection functions.
At the end of last year, we upgraded our wheeled robot. That product is relatively large, weighing about 70-80 kilograms, which makes it inconvenient in some scenarios, so we launched a smaller, dust- and water-proof version suitable for various indoor and outdoor settings. Even at a relatively large size, the robot remains agile: small robots are usually more flexible than large ones, yet we have maintained good motion performance at this scale.
In January this year, our robots appeared on the CCTV Spring Festival Gala. The biggest highlight was the fully automatic formation dance. Each robot carries three lidars on its head for automatic mapping and formation changes. To fit the stage performance, we handed backend control to the stage console, achieving millisecond-level synchronization between music and movement. Sixteen robots took part, all connected to our backend server and then to the stage system. The biggest challenges were multi-robot collaboration and maintaining the complex programming. These robots now perform daily at the MGM Macau.
For motion learning, we train by collecting human motion data and combining it with deep reinforcement learning. Unlike language model training, motion training needs only a small amount of real data; the rest is done through reinforcement learning. We mainly train on NVIDIA's Isaac Sim platform and have mastered actions such as dancing, jumping, and somersaulting. The biggest factor currently limiting more complex actions is not the algorithm but the physical limits of the hardware: raising running speed from 3-4 meters per second to 10 meters per second, for example, demands dramatic hardware improvements.
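Unitree's exact training recipe is not public, but combining human motion data with reinforcement learning typically uses an imitation (tracking) reward in the style of DeepMimic: the policy is rewarded for matching a reference motion retargeted from human capture. The sketch below shows that reward shape; the weight and joint values are illustrative assumptions.

```python
import math

def imitation_reward(pose, ref_pose, w_pose=2.0):
    """DeepMimic-style tracking reward: exp(-w * squared joint-angle error).

    `pose` and `ref_pose` are joint angles in radians; the reference comes
    from retargeted human motion capture. The weight w_pose is illustrative,
    not a value from Unitree's (unpublished) training setup.
    """
    err = sum((p - r) ** 2 for p, r in zip(pose, ref_pose))
    return math.exp(-w_pose * err)

ref = [0.1, -0.3, 0.5]                      # reference pose for one frame
print(imitation_reward(ref, ref))           # 1.0 (perfect tracking)
print(round(imitation_reward([0.2, -0.3, 0.5], ref), 3))  # 0.98 (small error)
```

The exponential keeps the reward bounded in (0, 1] and smooth, so small tracking errors are tolerated while large deviations are penalized sharply; in practice this term is summed with task rewards (velocity, balance) during RL training.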
We also place great importance on R&D of robot upper limbs and hands. We have independently developed a dexterous hand with about 20 degrees of freedom, aiming to let robots truly perform daily tasks rather than just demonstration moves. In the next one to two years, we hope to achieve natural interaction: for example, telling a robot to pour someone a glass of water, with no prior adaptation.
At the end of May this year, we partnered with CCTV to hold a robot fighting competition, which lasted about 1.5 hours and involved four teams. The algorithms for fighting are more complex than those for dance or kung-fu performances because action combinations are random and interference is strong, requiring smooth transitions and free combination of moves. Our goal is to achieve real-time generation of arbitrary actions in the future.
We also launched the R1 robot, which weighs about 25 kilograms and is lightweight and safe. Despite its small size, it has strong power performance and mainly targets industrial applications. Its algorithms are similar to those of humanoid robots, but because quadruped robots are more stable, they can perform more intense actions without being easily damaged and have strong obstacle-crossing capabilities.
Looking back, progress in AI and robotics has always been the result of global collaboration. Many forces, including NVIDIA, have promoted global cooperation in robotics and AI. Before general-purpose large models and robots that can truly perform tasks become widespread, we still need to work together to push humanity into the next technological era. I believe AI and robotics will, like electricity and the steam engine, lift human civilization to a new height.
Wang He of Galaxy Universal: Synthetic data is the key to the rapid implementation of embodied intelligence
Wang He, CEO of Galaxy Universal
All the robot companies present today, including NVIDIA and Galaxy Universal, share the common goal of building general-purpose robots. Such robots will become a key, revolutionary product in the next market worth trillions of dollars or RMB.
There are several core elements behind this revolutionary product:
- The first element is the robot's body;
- The second element is the embodied intelligence model that drives it;
- The third element is the data behind the model: what kind of data can train such capabilities.
Next, I will share Galaxy Universal's exploration and achievements in these aspects in turn and introduce the finally implemented products.
What differentiates Galaxy Universal from other companies is that our robots are not fully humanoid but wheeled robots with two arms and two hands. The wheeled chassis is designed for long endurance, industrial-grade safety, and reliable delivery at scale. The Galaxy Universal G1 robot debuted in May 2024; after more than a year of iteration, it now meets the standards for large-scale autonomous commercial use in automatic charging, operational smoothness, and stability.
We were among the first companies globally to receive the NVIDIA Jetson Thor chip and deployed it in humanoid robots in China, including an on-site deployment at this WRC. In the demonstration, the robot equipped with the chip showed smooth motion, real-time visual processing, and motion planning for cargo boxes, with a marked increase in speed; the on-site audience called it "the fastest humanoid robot." The powerful chip made this possible.
Our robots can navigate efficiently in complex environments thanks to the embodied VLA (vision-language-action) large models Galaxy Universal has developed over a long period. Our navigation model lets the robot move autonomously through a scene from a single sentence of instruction. Just before Children's Day, we launched TrackVLA globally: it requires no mapping, can follow a person through any complex scene, interacts in natural language, weaves through obstacles fully autonomously, and keeps following stably even amid crowds.
For upper-body manipulation, we launched Grasp VLA, a foundation model for grasping that generates grasping actions in a real-time closed loop. Under varied lighting and challenging backgrounds, it achieves zero-shot grasping of specified objects without prior training on them. This lays the foundation for future "natural language + immediate execution."
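The "real-time closed loop" idea can be sketched generically: re-observe and re-plan at every control step, so the grasp adapts if the object moves. The toy loop below illustrates only this perceive-plan-act structure; it is not Galaxy Universal's Grasp VLA implementation, and every function name is hypothetical.

```python
def closed_loop_grasp(get_observation, plan_grasp, move_toward,
                      max_steps=50, tol=0.01):
    """Generic perceive-plan-act loop for closed-loop grasping.

    Each step re-observes the scene and re-plans, so the grasp adapts when
    the object moves. The three callables stand in for a camera, a grasp
    model (e.g. a VLA), and a low-level controller; names are hypothetical.
    """
    for _ in range(max_steps):
        obs = get_observation()      # perceive: current scene state
        target = plan_grasp(obs)     # plan: regenerate the grasp target
        dist = move_toward(target)   # act: step the gripper toward the target
        if dist < tol:               # close enough: grasp succeeds
            return True
    return False

# Toy 1-D demo: the gripper converges on a fixed object position.
state = {"gripper": 0.0, "object": 0.5}

def get_observation():
    return state["object"]

def plan_grasp(obs):
    return obs

def move_toward(target):
    # Simple proportional controller: close half the remaining gap per step.
    state["gripper"] += 0.5 * (target - state["gripper"])
    return abs(target - state["gripper"])

success = closed_loop_grasp(get_observation, plan_grasp, move_toward)
print(success)  # True
```

The contrast with open-loop grasping is the key point: an open-loop system plans once from a single image, while a closed-loop system keeps correcting, which is what tolerates moving objects, occlusion, and changing lighting.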
Building on Grasp VLA, we created a retail-scenario application. Whether items are bottled, bagged, in bulk, hanging, or soft-bodied, the same model can grasp and deliver them. This is the world's first end-to-end retail large model, handling more than 50 different object placements and covering everything from rigid to soft bodies.
Galaxy Universal was able to launch multiple basic large models