
AI "seeing" experiment, a disruptive breakthrough at Harvard. With an AR glasses, novices can instantly become senior experts.

新智元 · 2025-11-18 20:16
The integration and collaboration of AI and humans are reshaping scientific research and manufacturing processes.

[Introduction] When AI can "see" the details of the laboratory, "hear" every reaction of the researcher, and "perceive" every change in experimental progress, its reasoning is no longer confined to the silicon-based world. AI can then participate directly in and change physical reality through human hands, becoming the most diligent and reliable "intelligent partner" in the laboratory. Today, this scenario is unfolding in micro-nano processing clean rooms and life science laboratories. AI can not only understand the birth of electronic devices but also begin to gain insight into the growth of cells and organoids. From chips to cells, from materials to life, the boundaries of "human-machine co-embodiment" are being redefined.

What are the most frustrating moments in scientific research and manufacturing workshops?

Imagine a scenario:

Xiao Li, a novice just starting out, is working in a flexible-electronics clean room.

In a typical reactive ion etching (RIE) step, the Standard Operating Procedure (SOP) requires the parameters to be set to 30 seconds and 50 watts.

He clumsily entered 300 seconds and 100 watts without realizing it; the senior engineer beside him was busy recording data and didn't notice this mistake either.

If the wrong parameters are run, the entire batch of devices may be directly scrapped, and days of preparation work will be in vain.

Or:

Researcher Lao Wang spent 3 months exploring a new material synthesis method.

However, the experimental log was scattered. When a novice wanted to replicate the work, they were at a loss when faced with the vague instruction "moderately adjust the temperature" in the SOP and had to spend several more months on trial and error.

Similar situations also occur in life science laboratories:

In front of the microscope for organoid culture, graduate student Xiao Zhang has just completed the cell digestion step.

She digested the cells for 40 seconds longer than the time specified in the SOP, causing cell confluence to drop and the organoid structure to collapse; or, when changing the medium, the reagent was a few degrees too cold, which affected the key indicators of the entire differentiation cycle.

The experimental log only stated that "the cell state was average," with nothing traceable. Weeks of cultivation were thus erased by a seemingly minor operational deviation.

These are not extreme cases but the norm in scientific research and high-end manufacturing.

Whether working with silicon wafers or cells, human hands are extremely dexterous yet prone to errors; and while the "digital brain" of traditional AI is smart, it cannot see the subtle experimental details of the real world.

Now, however, the Human-AI Co-Embodied Intelligence system developed by Professor Liu Jia's team at Harvard University offers a solution to these problems.

For the first time, this system deeply integrates human researchers, embodied AI agents, and wearable Mixed Reality/Augmented Reality (MR/AR) hardware into an integrated intelligent platform that can perceive, reason, and participate directly in real-world experiments and manufacturing processes alongside humans.

Human-AI co-embodied intelligence paradigm: The APEX system cooperates with humans to complete intelligent manufacturing

Agentic Lab - The human-AI co-embodied intelligence platform enables real-time collaboration between AI and researchers in organoid experiments and production

AI is "smart", and humans are "dexterous", but it's hard to cooperate

Scientific experiments and high-end manufacturing have always been "precise work".

It is necessary to strictly follow the SOP and flexibly adjust according to the real-time situation - this relies on the "feel" and judgment accumulated by human experts over the years.

However, whether in the chip manufacturing workshop or in front of the organoid culture dish, a common problem always exists:

1. The cost of human trial and error is too high, and it takes a long time for novices to grow.

Even experienced researchers may enter wrong parameters or miss a recording step through negligence, scrapping an entire batch of samples; novices often need months or even years of apprenticeship-style mentoring to master the "implicit skills" not written in the SOP.

In life science laboratories, this kind of "implicit experience" is particularly costly - cultivating a batch of cells or organoids often takes weeks to months. A few extra seconds of digestion, or a reagent a few degrees too cold, can render the entire cycle's effort futile.

2. AI is "disconnected" from the physical world.

Current AI models can already write seemingly perfect experimental designs and analyze data charts, but they are still trapped in the digital world -

They can't see the state changes of cells in the culture dish, can't hear the device prompts at the bench, and can't instantly understand human hand operations. Even the most precise plan is hard to carry out in the actual execution stage.

Although "autonomous laboratories" have emerged in recent years, robots' limited dexterity and insufficient environmental perception mean that many precise experiments still require on-the-spot human response and judgment.

Thus, an embarrassing contradiction has become increasingly prominent:

  • AI has powerful reasoning and memory but lacks matching "embodied perception";
  • Humans have dexterous hands and intuition but are prone to errors, omissions, and forgetfulness.

Each has its own advantages, but there is always an "invisible wall" between them.

Then, is it possible -

· To let humans and AI truly "work as a team"?

· To make AI no longer just a cold algorithm but an "experimental partner" that can share vision, perception, and operation with humans?

· To combine human flexible operations and long-term experience with AI's powerful memory, context reasoning, and tireless focus to achieve "1 + 1 > 2" intelligent co-research?

Human-AI co-embodied intelligence: a closed loop of AI glasses and multi-agent collaboration

Professor Liu Jia's team at Harvard University is creating a new scientific research paradigm - Human-AI Co-Embodied Intelligence.

In this paradigm, humans and AI are no longer in a one-way "instruction and execution" relationship; instead, they jointly perceive, reason, make decisions, and act through multi-agent collaboration.

AI is not just a brain on the screen but a collaborator in the real world.

In micro-nano processing: Let AI understand human actions

In the micro-nano manufacturing workshop, the APEX (Agentic-Physical Experimentation) system developed by Liu Jia's team captures the researcher's hand movements, line of sight, and environmental changes through Mixed Reality (MR) glasses with 8K resolution and an ultra-low latency of 32ms.

This real-time information becomes the "sensory input" through which AI understands the physical world.

The system is composed of four agents that collaborate to close the loop:

Planning Agent: Break down the research goal into executable processes;

Context Agent: Identify the device status, process parameters, and operator behavior;

Step Tracking Agent: Precisely compare the current operation with the SOP to judge the progress;

Analysis Agent: Integrate video, voice, and device data to provide error correction and guidance.

These four agents work like a well-coordinated research team, aligning AI's reasoning with human actions in real time and achieving "human-machine co-research" on the fast-paced, complex processing floor.
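To make the division of labor concrete, here is a minimal Python sketch of how such a perceive-reason-feedback loop could be wired up. It is an illustration only: the class names, fields, and SOP values are assumptions made for this example, not APEX's actual implementation.

```python
# Minimal, hypothetical sketch of a four-agent perceive-reason-feedback loop.
# All names and SOP values are illustrative assumptions, not the APEX API.
from dataclasses import dataclass


@dataclass
class Frame:
    """One perception snapshot from the MR glasses (illustrative)."""
    tool: str        # tool or instrument currently in use
    params: dict     # parameters read from the instrument panel
    transcript: str  # speech captured in this time window


class PlanningAgent:
    def plan(self, goal: str) -> list[str]:
        # Break the research goal into ordered SOP steps (hard-coded here).
        return ["spin-coat resist", "soft bake", "RIE etch"]


class ContextAgent:
    def describe(self, frame: Frame) -> dict:
        # Identify device status, process parameters, and operator behavior.
        return {"tool": frame.tool, "params": frame.params}


class StepTrackingAgent:
    def __init__(self, sop: dict):
        self.sop = sop  # step name -> expected parameters

    def check(self, step: str, context: dict) -> list[str]:
        # Compare observed parameters against the SOP for the current step.
        expected = self.sop.get(step, {})
        return [f"{key}: expected {value}, got {context['params'].get(key)}"
                for key, value in expected.items()
                if context["params"].get(key) != value]


class AnalysisAgent:
    def advise(self, deviations: list[str]) -> str:
        # Fuse deviations into a single correction prompt for the headset.
        if not deviations:
            return "Step looks consistent with the SOP."
        return "Alarm: " + "; ".join(deviations)


def run_step(goal: str, frame: Frame) -> str:
    sop = {"RIE etch": {"time_s": 30, "power_w": 50}}
    plan = PlanningAgent().plan(goal)
    context = ContextAgent().describe(frame)
    deviations = StepTrackingAgent(sop).check(plan[-1], context)
    return AnalysisAgent().advise(deviations)


if __name__ == "__main__":
    frame = Frame(tool="RIE", params={"time_s": 300, "power_w": 100},
                  transcript="starting the etch now")
    print(run_step("flexible electrode array", frame))
```

Run on a deliberately wrong frame (300 seconds, 100 watts), the sketch produces the same kind of alarm described later in this article.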

In actual tests, APEX's accuracy on tool recognition and step tracking tasks is 24% to 53% higher than that of multimodal models such as GPT-4o and Gemini, and it leads in stability and understanding in dynamic physical environments.

In life experiments: Let machines understand the details of "life"

In biological laboratories and production, the Agentic Lab developed by the team allows AI to enter the world of life science through Augmented Reality (AR) glasses.

The system is composed of multiple collaborating agents -

Centered on the virtual PI MolAgent, it connects sub-agent modules such as the knowledge retrieving subagent, the multi-scale data analysis subagent, and the SingleObjectVision, ObserverVision, and Composer subagents.

When the researcher is operating in front of the microscope, the AR glasses capture the first-person perspective. AI can instantly identify the experimental stage, monitor potential deviations, and provide guidance in the form of visual prompts or voice.

It can also analyze microscope images automatically: with Cellpose-SAM segmentation, VLM text descriptions, and CLIP/SigLIP embeddings, it constructs an interpretable phenotypic representation - AI can not only "see" the cells but also "understand" their states.
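As a rough illustration of that segmentation-description-embedding chain, the sketch below uses placeholder functions where the real pipeline would call Cellpose-SAM, a vision-language model, and a CLIP/SigLIP encoder; every name and value here is hypothetical.

```python
# Illustrative sketch of a segment -> describe -> embed pipeline. The helper
# functions are stand-ins, not calls into Cellpose-SAM or CLIP/SigLIP.
from dataclasses import dataclass

import numpy as np


@dataclass
class Phenotype:
    mask_id: int
    description: str       # free-text caption of this cell/organoid region
    embedding: np.ndarray  # embedding of the caption for later comparison


def segment(image: np.ndarray) -> list[np.ndarray]:
    """Placeholder for instance segmentation (e.g. Cellpose-SAM masks)."""
    return [np.ones_like(image, dtype=bool)]  # one dummy mask


def describe(image: np.ndarray, mask: np.ndarray) -> str:
    """Placeholder for a VLM caption of the masked region."""
    return "rounded organoid with dense core and smooth boundary"


def embed(text: str) -> np.ndarray:
    """Placeholder for a CLIP/SigLIP-style text embedding."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(512)


def phenotype_representation(image: np.ndarray) -> list[Phenotype]:
    # Segment every object, caption it, and embed the caption so that states
    # can be compared, clustered, and queried later.
    reps = []
    for i, mask in enumerate(segment(image)):
        text = describe(image, mask)
        reps.append(Phenotype(mask_id=i, description=text, embedding=embed(text)))
    return reps


if __name__ == "__main__":
    frame = np.zeros((256, 256))
    for p in phenotype_representation(frame):
        print(p.mask_id, p.description, p.embedding.shape)
```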

In complex organoid culture, the system can detect tiny morphological heterogeneities, point out possible causes (such as insufficient nutrients in the culture medium or over-digestion), and suggest adjusting the conditions.

Actual tests show that the agreement rate of Agentic Lab's ObserverVision with experts in key frame judgment is 72.3%, and the partial agreement rate is 9.2%. The combined agreement rate exceeds 80%, achieving true "human-machine co-embodied collaboration" in life experiment scenarios.

The APEX human-machine co-embodied interface uses multi-agent collaboration to recognize chip preparation stages, correct operational errors, and give real-time guidance in the clean room, supporting closed-loop chip fabrication

The Augmented Reality (AR) human-machine co-embodied interface of Agentic Lab uses multi-agent collaboration to recognize experimental stages, correct operational errors, and give real-time guidance, supporting closed-loop cell and organoid experiments

Proven in real-world tests: three core capabilities that transform the scientific research and manufacturing process

In the clean room of flexible electronics manufacturing, APEX has undergone rigorous real-world tests; in cell and organoid laboratories, Agentic Lab has likewise faced real-world challenges.

The two systems have proven from different dimensions that when AI and humans collaborate in a co-embodied way, the scientific research process will be completely rewritten.

APEX technology helps engineers check chip preparation parameters and helps novices use instruments

1. Real-time error correction: One-second warning, zero-error operation

Remember Xiao Li at the beginning?

In the RIE step, through the real-time monitoring of the MR glasses, APEX popped up a prompt the moment he entered the wrong parameters:

Alarm: The current settings are incorrect... The required parameters are 30 seconds and 50 watts.

Immediate feedback, immediate correction - potential losses are nipped in the bud.

In the AR environment of Agentic Lab, when the researcher runs a step overtime or the liquid volume does not match during cell passage or digestion, the AR glasses detect the abnormality and issue a reminder immediately, avoiding irreversible errors such as over-digestion of cells or contamination of the culture medium. It becomes a true "second pair of eyes" at the bench.
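A tiny sketch of the kind of timing check the AR side could run during digestion is shown below. The SOP target and tolerance are made-up example values, and the function names are not Agentic Lab's API.

```python
# Hypothetical timing check: warn the moment an observed step duration
# exceeds the SOP window. Values and names are illustrative only.
from typing import Optional
import time

SOP_DIGESTION = {"target_s": 300, "tolerance_s": 15}  # assumed example values


def monitor_digestion(started_at: float, now: float) -> Optional[str]:
    """Return a warning string once digestion runs past the SOP window."""
    elapsed = now - started_at
    limit = SOP_DIGESTION["target_s"] + SOP_DIGESTION["tolerance_s"]
    if elapsed > limit:
        return (f"Alarm: digestion at {elapsed:.0f}s exceeds the SOP window "
                f"({SOP_DIGESTION['target_s']} ± {SOP_DIGESTION['tolerance_s']}s); "
                "stop and neutralize now.")
    return None


if __name__ == "__main__":
    start = time.time() - 340  # simulate a step that has already overrun
    print(monitor_digestion(start, time.time()) or "Within SOP window.")
```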

Schematic diagram of the continuous learning and knowledge memory evolution of Agentic Lab

2. Automatic recording and analysis: Make every step traceable

In the manufacturing workshop, APEX can automatically record every operation parameter, device reading, environmental snapshot, and timestamp, forming a structured and searchable digital experimental log.

When the researcher asks later:

How long did I set the timer for the last RIE step?

The system can immediately respond:

You set 30 seconds in step 5, and this step was completed at [specific time].

There is no need to rummage through scattered records - the entire scientific research process is traceable.
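The traceability idea can be pictured with a small sketch: each operation is stored as a typed, timestamped record, and a later question is answered by a lookup over those records. The schema and field names below are assumptions for illustration, not the actual APEX data model.

```python
# Illustrative structured, searchable experiment log; schema is an assumption.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass
class LogEntry:
    step: int
    action: str
    parameters: dict
    instrument: str
    timestamp: str


log: list[LogEntry] = []


def record(step: int, action: str, parameters: dict, instrument: str) -> None:
    # Append one timestamped operation record to the log.
    log.append(LogEntry(step, action, parameters, instrument,
                        datetime.now(timezone.utc).isoformat()))


def last_setting(action: str, key: str) -> str:
    # Answer "how long did I set the timer for the last RIE step?"-style queries.
    for entry in reversed(log):
        if entry.action == action and key in entry.parameters:
            return (f"You set {entry.parameters[key]} in step {entry.step}; "
                    f"this step was completed at {entry.timestamp}.")
    return "No matching record."


if __name__ == "__main__":
    record(step=5, action="RIE etch", parameters={"time_s": 30, "power_w": 50},
           instrument="RIE chamber 2")
    print(last_setting("RIE etch", "time_s"))
    print(json.dumps([asdict(e) for e in log], indent=2))  # exportable log
```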

Moreover, APEX has customized experimental memory and intelligent analysis: it can record, give feedback on, and optimize operations in real time. Combining historical data, user preferences, and device status, it analyzes experiments and co-creates new experimental plans with researchers.

In Agentic Lab, this concept is also extended to the field of life science: the system not only records experimental actions and parameters but also can automatically generate multi-modal analysis reports.

By analyzing microscope images, tables, and text with a VLM, the AI can give real-time evaluations of cell states, judgments of differentiation progress, and optimization suggestions.

Every experiment is distilled into a structured "digital experimental memory" that can be searched, reproduced, and even used by the AI to reflect on and improve the next experiment.

3. Skill transfer and intelligent co-research: Turn novices into experts in seconds

With APEX's 3D visual prompts and voice guidance, novices can smoothly complete the complex eight-step RIE process after only a single training session.

Their operation level is almost the same as that of experts - skills that used to take months to master can now be learned in a single session.