
Jensen Huang's opening speech at GTC: "AI-XR Scientist" is here

Machine Intelligence (机器之心) | 2025-11-20 10:20
After an AI has read so many papers, can it conduct experiments?

LabOS: when AI can not only think but also "see", "guide", and "operate" real experiments, a new era of scientific discovery built on the co-evolution of human and machine intelligence is quietly unfolding.

In a seemingly ordinary biology laboratory, a scientist is preparing a solution under the guidance of XR smart glasses. Prompts appear on the lenses in real time: "Stem cell culture is complete. Please take a sample." At that moment, a robot takes over the test tube in his hand and runs it on a vortex mixer. By the time the scientist retrieves the cells, the next step of the CRISPR gene-editing protocol is already displayed in his field of view.

The mastermind behind all this is LabOS, an AI co-scientist equipped with a "world model" of the laboratory. Like a conductor with a full view of the stage, it reads multimodal data as its score and precisely directs multiple AI agents, human scientists, and experimental robots. In this deeply integrated experimental ecosystem, the three no longer work in isolation but together play a symphony of scientific discovery that is efficient, reproducible, and continuously evolving.

This disruptive human-machine collaboration in the laboratory comes from breakthrough research personally presented by NVIDIA CEO Jensen Huang at the GTC conference in Washington on October 29. A team led by Professor Le Cong of Stanford University and Professor Mengdi Wang of Princeton University, in collaboration with NVIDIA, officially launched LabOS, the world's first Co-Scientist platform integrating AI and XR (extended reality).

LabOS website: https://ai4labos.com

Paper link: https://arxiv.org/abs/2510.14861

The breakthrough of LabOS lies in being the first to integrate multimodal perception, self-evolving agents, and extended reality (XR), seamlessly connecting the computational reasoning of dry-lab work with real-time human-machine collaboration in wet-lab operations to form an end-to-end closed loop from hypothesis generation to experimental verification. This not only creates a dynamically evolving "world model" for scientific research but also opens a new era of scientific discovery built on the co-evolution of human and machine intelligence.

Professor Le Cong of Stanford University said: "With this breakthrough achieved in cooperation with NVIDIA, we can shorten work that used to take years to a few weeks, reduce research costs from millions of dollars to a few thousand, and cut the training cycle for top research talent from months to days. We are very excited to present this achievement in close partnership with NVIDIA. What is even more exciting is that this is just the beginning. With the rise of autonomous research laboratories, this innovation is not only changing lives but saving them, faster and at lower cost!"

Figure 1: LabOS system architecture, integrating the self-evolving AI agent for dry-lab work with XR + robot human-machine interaction in the wet lab to achieve end-to-end scientific discovery

1. From computational reasoning to physical collaboration: The embodied evolution of AI laboratories

Previous scientific AIs, such as AlphaFold and Deep Research, operate mostly in the purely digital world. They are natural "theorists" but have no access to real physical experiments. The "last step" in the laboratory still depends heavily on scientists' manual operation and tacit experience, which has become a bottleneck for research efficiency and reproducibility.

The breakthrough of LabOS is to build an embodied system that lets AI enter the real laboratory. It integrates abstract intelligence with physical operation to create an AI co-scientist with coordinated "brain, eyes, and hands":

The thinking "brain": a self-evolving AI agent. Built on the team's earlier STELLA framework, LabOS includes four agents: planning, development, review, and tool creation. They can not only decompose research tasks and write analysis code but also independently create new tools from large volumes of literature and data through the "tool ocean" module, continuously evolving their reasoning ability. This built-in self-evolution lets the system tackle novel research tasks by scaling its capabilities at inference time.
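To make the four-agent division of labor concrete, here is a minimal, purely illustrative Python sketch of such a loop. Every name below (Task, ToolOcean, plan, develop, review) is an assumption for illustration, not the published LabOS/STELLA API.

```python
# Hypothetical sketch of a plan -> develop -> review loop with a growing
# tool registry ("tool ocean"). Names and structure are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Task:
    goal: str
    plan: list[str] = field(default_factory=list)
    artifacts: dict[str, str] = field(default_factory=dict)

class ToolOcean:
    """Registry of tools that grows over time ('self-evolution')."""
    def __init__(self):
        self.tools = {}
    def create(self, name, fn):
        self.tools[name] = fn  # a new tool mined from literature/data

def plan(task: Task) -> None:
    # Planning agent: decompose the goal into ordered steps.
    task.plan = [s.strip() for s in task.goal.split(";")]

def develop(task: Task, ocean: ToolOcean) -> None:
    # Development agent: produce code/analysis for each step.
    for step in task.plan:
        task.artifacts[step] = f"analysis for {step!r}"

def review(task: Task) -> bool:
    # Review agent: check that every planned step has an artifact.
    return all(task.artifacts.get(s) for s in task.plan)

task = Task(goal="screen candidates; rank hits; design validation")
ocean = ToolOcean()
ocean.create("enrichment_test", lambda genes: sorted(genes))
plan(task)
develop(task, ocean)
print("review passed:", review(task))
```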

The understanding "eye": a vision-language model designed specifically for laboratories. The team collected more than 200 first-person experimental videos to build the LabSuperVision (LSV) benchmark and found that even the strongest general-purpose large models perform poorly at understanding fine-grained experimental operations. They therefore trained a dedicated LabOS-VLM, whose accuracy on tasks such as error detection far exceeds that of general models.

The collaborative "hand": a real-time human-robot experimental execution system. Researchers wear lightweight AR glasses while conducting experiments. LabOS analyzes the video stream every 5-10 seconds, providing step-by-step guidance, error alerts, and operation suggestions in real time, and coordinates the LabOS Robot's participation in experimental operations. All interactions happen through gestures and voice on the XR interface, keeping human-machine collaboration smooth in a sterile environment.
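The real-time guidance described above is essentially a sampling loop over the headset's video stream. Below is a minimal sketch; capture_frame and vlm_check_step are hypothetical stand-ins for the camera feed and the fine-tuned VLM, not real LabOS components.

```python
# Sketch of the periodic XR guidance loop: sample a frame, ask the VLM
# whether the current step matches the protocol, surface guidance/alerts.
import time

PROTOCOL = ["thaw cells", "add medium", "transfer to plate"]

def capture_frame() -> bytes:
    return b"<frame>"  # placeholder for an XR-glasses video frame

def vlm_check_step(frame: bytes, expected: str) -> dict:
    # A real system would call the fine-tuned LabOS-VLM here.
    return {"matches": True, "hint": f"proceed: {expected}"}

def guidance_loop(interval_s: float = 5.0):
    for step in PROTOCOL:
        result = vlm_check_step(capture_frame(), step)
        if not result["matches"]:
            print("ALERT: deviation at step:", step)
        else:
            print("HUD:", result["hint"])
        time.sleep(interval_s)  # the paper describes a 5-10 s cadence

if __name__ == "__main__":
    guidance_loop(interval_s=0.1)  # shortened interval for the demo
```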

Figure 2: 4D reconstruction of the physical laboratory scene; LabOS achieves real-time human-robot collaboration through XR glasses

2. How does the world model understand the laboratory?

The complexity of the laboratory environment places extremely high demands on AI's visual understanding. To evaluate AI models' laboratory perception and reasoning, the team built the LabSuperVision (LSV) benchmark, comprising more than 200 experimental video sessions recorded from the first-person perspective by researchers wearing cameras, with operation steps, error types, and key parameters annotated by experts. The results were unexpected: current leading AI models perform poorly on this benchmark. Gemini, GPT-4o, and others scored only 2-3 points (out of 5) on protocol-alignment and error-recognition tasks, far below the bar for laboratory use.
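For concreteness, an LSV-style record might carry the annotations described above. The field names and the 0-5 rubric scorer below are illustrative assumptions, not the released benchmark schema.

```python
# Illustrative shape of one benchmark record and an averaging scorer.
from dataclasses import dataclass

@dataclass
class LSVRecord:
    video_id: str
    protocol_steps: list[str]       # expert-annotated SOP steps
    error_labels: list[str]         # annotated error types, if any
    key_parameters: dict[str, str]  # e.g. volumes, temperatures

def score_model(predict, records: list[LSVRecord]) -> float:
    """Average a 0-5 rubric score assigned per video session."""
    scores = [predict(r) for r in records]
    return sum(scores) / len(scores)

records = [LSVRecord("v001", ["thaw", "mix"], ["skipped_mix"], {"vol": "5 mL"})]
# A general-purpose model scoring ~2-3 out of 5, as reported above:
print(score_model(lambda r: 2.5, records))
```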

To remove this bottleneck, the team post-trained a vision-language model (VLM) focused on the laboratory setting, combining publicly available experimental videos, internally recorded data, and expert annotations. The resulting LabOS-VLM decodes the visual input from XR glasses and aligns visual embeddings with the language model to interpret and reason about the laboratory scene. After supervised fine-tuning and reinforcement-learning optimization, the model shows markedly better visual reasoning in scientific settings: in a cell-transfection experiment, for example, it identified in real time errors caused by the experimenter deviating from the standard operating procedure (SOP) and generated step-by-step guidance. The 235B-parameter version achieves error-detection accuracy above 90%, far exceeding general-purpose models.
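One plausible way the expert annotations could be turned into supervised fine-tuning pairs is sketched below; the prompt template and fields are assumptions for illustration, not the paper's actual training format.

```python
# Convert an annotated frame + expected SOP step into an instruction pair.
def to_sft_pair(frame_desc: str, sop_step: str, error: str | None) -> dict:
    prompt = (f"Frame: {frame_desc}\n"
              f"Expected SOP step: {sop_step}\n"
              "Question: is the operator following the SOP?")
    answer = "Yes, proceed." if error is None else f"No: {error}"
    return {"prompt": prompt, "response": answer}

pairs = [
    to_sft_pair("pipetting into 6-well plate", "add 2 mL medium per well", None),
    to_sft_pair("vortexing open tube", "vortex capped tube 10 s",
                "tube left uncapped"),
]
for p in pairs:
    print(p["prompt"], "->", p["response"])
```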

Meanwhile, to deepen the system's understanding of the laboratory's physical space, LabOS builds a three-dimensional laboratory environment with temporal awareness and semantic understanding for the AI. In this environment, the AI can not only identify every vessel, device, and sample in the laboratory but also understand their semantic relationships and temporal evolution: which step of the experiment is underway, which operations have been completed, which reactions are still running, and where problems occurred. This high-precision world model also forms the spatial-cognition foundation that allows the LabOS Robot to perform experimental tasks autonomously.
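A time-aware semantic world model of this kind can be pictured as entities plus timestamped events that can be queried for experiment progress. The structure below is a minimal illustrative sketch, not the LabOS implementation.

```python
# Toy lab "world model": entities (vessels, devices, samples) plus a
# timestamped event log, queryable for the current experimental state.
from dataclasses import dataclass, field

@dataclass
class LabWorldModel:
    entities: dict[str, str] = field(default_factory=dict)  # id -> type
    events: list[tuple[float, str, str]] = field(default_factory=list)

    def record(self, t: float, entity: str, event: str):
        self.events.append((t, entity, event))

    def current_step(self) -> str:
        return self.events[-1][2] if self.events else "not started"

wm = LabWorldModel(entities={"tube_1": "15 mL tube", "robot_1": "arm"})
wm.record(0.0, "tube_1", "cells seeded")
wm.record(42.5, "robot_1", "vortex mixing started")
print(wm.current_step())  # -> "vortex mixing started"
```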

This complete technical path, from data construction and model training to real-time interaction, gives the LabOS system scientific visual-reasoning ability and builds an efficient collaborative closed loop among AI, humans, and robots in real experimental settings.

Figure 3: From building the LSV benchmark data to training the LabOS-VLM model, enabling real-time human-machine interaction in the laboratory

3. Three empirical studies of human-machine collaboration: From target discovery to skill inheritance

The LabOS paper demonstrates its capabilities as a collaborative scientist through three cutting-edge biomedical research cases:

Autonomous discovery of new targets for cancer immunotherapy

A central challenge in cancer immunology is identifying the genes that mediate tumor immune escape. Traditional screening methods are limited in throughput and rely on expert experience for analysis. LabOS demonstrates full-pipeline research ability across dry-lab analysis, clinical data analysis, and wet-lab experimentation: the system first uses CRISPR activation screening to autonomously identify and iteratively refine CEACAM6 as a candidate gene for resistance to NK-cell killing in melanoma cells; it then performs survival analysis on The Cancer Genome Atlas (TCGA) data to establish the clinical correlation between CEACAM6 expression and patient prognosis; finally, wet-lab experiments verify that activating CEACAM6 significantly enhances tumor resistance to NK cells. This end-to-end closed loop from computational reasoning to experimental verification reflects LabOS's systematic research capability in target discovery.
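The TCGA step described above is a standard expression-stratified survival comparison. Here is a sketch of that kind of analysis, assuming the lifelines library is available; the data is randomly generated for the example, not TCGA.

```python
# Split patients by (synthetic) CEACAM6 expression and compare survival
# between the high- and low-expression groups with a log-rank test.
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
n = 200
expr = rng.normal(size=n)                   # stand-in expression values
high = expr > np.median(expr)
# Synthetic survival times: high expressors get shorter times on average.
durations = rng.exponential(scale=np.where(high, 20.0, 30.0))
events = rng.random(n) < 0.7                # True = death observed

res = logrank_test(durations[high], durations[~high],
                   events[high], events[~high])
print(f"log-rank p-value: {res.p_value:.4f}")
```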

Figure 4: Application of LabOS in target discovery

Generation and verification of scientific hypotheses in mechanism research

In studying the mechanism of cell fusion, a fundamental biological process, LabOS shows strong ability to generate and verify scientific hypotheses. By integrating pathway-enrichment analysis, interaction priors, and functional evidence, LabOS autonomously nominated ITSN1 as a core regulatory gene. The research team then performed functional verification of cell fusion in a U2OS cell model using CRISPR interference. Quantitative imaging and cell experiments showed that knocking down ITSN1 indeed significantly inhibits cell fusion. This complete closed loop from AI-generated hypothesis to wet-lab verification demonstrates LabOS's unique value as a co-scientist in driving mechanistic discovery.
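Integrating several evidence channels into a single candidate ranking, the kind of step that nominated ITSN1, can be illustrated with a toy weighted sum. The genes, scores, and weights below are invented for the example.

```python
# Combine per-gene scores from multiple evidence channels into one ranking.
def rank_candidates(evidence: dict[str, dict[str, float]],
                    weights: dict[str, float]) -> list[tuple[str, float]]:
    genes = set().union(*[e.keys() for e in evidence.values()])
    scored = {g: sum(weights[ch] * evidence[ch].get(g, 0.0)
                     for ch in weights) for g in genes}
    return sorted(scored.items(), key=lambda kv: -kv[1])

evidence = {
    "enrichment":  {"ITSN1": 0.9, "GENE_B": 0.4},  # pathway enrichment
    "interaction": {"ITSN1": 0.7, "GENE_B": 0.6},  # interaction priors
    "functional":  {"ITSN1": 0.8, "GENE_B": 0.2},  # functional evidence
}
weights = {"enrichment": 0.4, "interaction": 0.3, "functional": 0.3}
print(rank_candidates(evidence, weights))  # ITSN1 ranks first
```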

Figure 5: Application of LabOS in mechanism research

Skill inheritance in stem cell engineering

The reproducibility of complex wet-lab experiments has long been plagued by hard-to-articulate tacit knowledge and operational deviations. Through XR smart glasses and visual reasoning, LabOS provides real-time guidance and operation capture in complex experiments such as CRISPR gene editing of stem cells, and can automatically record expert sessions to form standardized digital protocols. It then serves as an AI tutor that helps novices quickly master key techniques, significantly improving experimental reproducibility and the efficiency of skill transfer.
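Turning a captured expert session into a standardized, replayable protocol can be pictured as sorting timestamped actions into numbered steps. The event format below is an illustrative assumption, not LabOS's recording format.

```python
# Convert captured expert actions into an ordered, human-readable protocol.
from dataclasses import dataclass

@dataclass
class CapturedAction:
    t: float        # seconds since session start
    verb: str       # e.g. "aspirate", "thaw"
    target: str     # object acted on
    params: str     # volumes, timings, temperatures

def to_protocol(actions: list[CapturedAction]) -> list[str]:
    ordered = sorted(actions, key=lambda a: a.t)
    return [f"{i + 1}. {a.verb} {a.target} ({a.params})"
            for i, a in enumerate(ordered)]

session = [
    CapturedAction(12.0, "aspirate", "well A1", "200 uL"),
    CapturedAction(3.5, "thaw", "vial CRISPR-mix", "room temp, 5 min"),
]
for line in to_protocol(session):
    print(line)
```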

Figure 6: Application of LabOS in stem cell research

4. Future: The “North Star” towards autonomous scientific discovery

The birth of LabOS marks a fundamental leap in the paradigm of scientific discovery. Elaborating on their vision, Professors Le Cong and Mengdi Wang emphasized that LabOS aims to "Scale Science with AI Together": to expand the boundaries of science together with AI.

Scientific exploration has long been limited by the speed of human cognition and the precision of experimental operation. Traditional laboratories are like isolated islands, relying on unreproducible "craftsmanship" and unscalable personal experience. LabOS is breaking this shackle: it lets AI and, in the future, robots truly "enter" the laboratory, understand and participate in every step of an experiment, and become collaborators working side by side with human scientists. Every experiment, successful or not, becomes nourishment for this AI scientist's growth, driving its continuous evolution. A research ecosystem built on the co-evolution of human and machine intelligence will fundamentally accelerate scientific discovery.

About the author team

Le Cong is a professor in the Departments of Pathology and Genetics at Stanford University School of Medicine and is recognized as one of the founders of the CRISPR gene-editing field. While pursuing his Ph.D. at Harvard University under his supervisors Feng Zhang and George Church, he published several landmark first-author papers in journals such as Science, Cell, and Nature, which first confirmed that the CRISPR/Cas9 system can perform gene editing in mammalian and human cells and laid the foundation for the technology's subsequent applications. He later shifted his focus to single-cell genomics under Professor Aviv Regev, then established an independent laboratory at Stanford and released a series of new technologies combining machine learning, AI, and biomedicine, spanning gene editing, cell tracking, and immune target discovery. In recent years, he has worked to accelerate biomedicine by building an "AI scientist", releasing works such as the RNAGenesis foundation model, the CRISPR-GPT agent, and LabOS, which integrate AI agents, XR glasses, and robotic automation.

Mengdi Wang is the director of the Center for AI Innovation at Princeton University and a professor in Princeton's Department of Electrical and Computer Engineering. She was admitted to the Department of Automation at Tsinghua University at age 14 and began her Ph.D. in computer science at the Massachusetts Institute of Technology (MIT) at 18, supervised by Dimitri P. Bertsekas, a member of the US National Academy of Engineering. One year after earning her Ph.D., she joined the Princeton faculty as a doctoral supervisor, becoming the youngest tenured professor at Princeton.

This work was led by Professor Le Cong of Stanford University and Professor Mengdi Wang of Princeton University. Dr. Zaixi Zhang of Princeton University, David Smerkous and Dr. Xiaotong Wang of Stanford University, and Dr. Di Yin, among others, are co-first authors.

This article is from the WeChat official account "Machine Intelligence" (机器之心) and is published by 36Kr with authorization.