Reac-Discovery Tackles the Universality Problem of Self-Driving Laboratories, Reaching 91% Printability-Prediction Accuracy with Mathematical Modeling, ML, and Automated Experiments

An AI-powered self-driving laboratory accelerates chemical discovery

Self-driving laboratories have made chemical reactor design faster and more accurate, but existing research lacks a unified model for geometric parameters. To make such systems more broadly applicable, a research team from the IMDEA Materials Institute in Spain has launched Reac-Discovery, a semi-autonomous digital platform. Built around periodic open-cell (POC) structures, it offers a new route to advanced catalytic reactors.

In reactor engineering, 3D printing can precisely fabricate periodic open-cell (POC) structures with regularly arranged pores. The resulting lattice-like reactors let gases, liquids, and heat move through them smoothly, which opens the way to higher reaction efficiency. Adding artificial intelligence gives the laboratory a degree of self-regulation: an automated platform monitors temperature, flow rate, and reaction progress in real time and adjusts the experimental protocol based on the data. These systems, known as self-driving laboratories (SDLs), are bringing unprecedented precision and speed to reactor design.

However, despite these advances, existing research still lacks a unified model for geometric parameters such as porosity, surface area, and tortuosity. Traditional methods such as computational fluid dynamics (CFD) simulation are slow and computationally expensive, and the design of structured reactors often relies on manual experience and specialized software. Without a generalizable framework, results are hard to reuse or transfer between systems.

To address these limitations, the research team from the IMDEA Materials Institute in Spain has launched Reac-Discovery, a semi-autonomous digital platform. Built around periodic open-cell (POC) structures, it is a closed-loop system that integrates design, manufacturing, and optimization modules. It can evaluate multiple reactors in parallel and combines real-time nuclear magnetic resonance (NMR) monitoring, machine learning (ML) optimization of process parameters, and topological descriptors. The platform improves reactor performance and reaction efficiency, reduces material consumption, and makes the approach more broadly applicable.

The relevant research results were published in Nature Communications under the title "Reac-Discovery: an artificial intelligence–driven platform for continuous-flow catalytic reactor discovery and optimization".

Research Highlights:

* Combines mathematical modeling, machine learning, and automated experimentation to integrate the full catalytic reactor workflow, from geometric design and 3D printing to experimental optimization;

* Adds topological parameters to the optimization space, moving beyond traditional methods that tune only individual variables such as temperature and flow rate, and optimizes geometry and process conditions simultaneously;

* Builds neural-network performance prediction models and machine learning-driven algorithms that allow reactor performance to be evaluated and iterated quickly, significantly improving experimental efficiency and resource utilization.

Paper Link: https://go.hyper.ai/ueB79

Autonomous Generation of Datasets to Support Closed-Loop Optimization

This research did not use external public datasets. Instead, the team used the Reac-Discovery platform to generate its own multi-dimensional data during the experiments, covering geometric structure, printability, and reaction performance. Following the platform's three functional modules, Reac-Gen, Reac-Fab, and Reac-Eval, the data falls into three parts:

* Structural parameterization dataset: Reac-Gen uses a mathematical parameterization model to generate periodic open-cell (POC) structures, controls their morphology through parameters such as size, threshold, and resolution, and provides quantitative input for topological optimization;

* Printability dataset: Produced as Reac-Fab maps structural parameters to printing outcomes;

* Reaction performance dataset: Produced by Reac-Eval during parallel experiments on the self-driving laboratory (SDL), recording temperature, flow rate, concentration, yield, and other data in real time.

Currently, all data generated by the closed-loop framework from structure generation to performance verification have been uploaded to Zenodo.

Dataset Link: https://hyper.ai/datasets/45520

Reac-Discovery: Three Modules Integrated into a Closed Loop

The overall architecture of Reac-Discovery is centered on machine learning (ML) and uses data feedback to close the loop of generation, manufacturing, evaluation, and optimization. The platform consists of three interconnected modules, Reac-Gen, Reac-Fab, and Reac-Eval:

* Reac-Gen: Parametrically generates periodic open-cell (POC) structures and analyzes their geometry, with feedback provided through machine learning (ML);

* Reac-Fab: Verifies the printability of the reactor with a machine learning model, manufactures it by high-resolution 3D printing, and then performs catalytic functionalization;

* Reac-Eval: Combines machine learning with analysis of real-time nuclear magnetic resonance (NMR) monitoring data and uses artificial neural networks (ANNs) to optimize process and geometry together; the experimental results are then fed back to the core machine learning model, driving a self-learning, iterative cycle of reactor improvement.

Overall architecture of the closed-loop platform Reac-Discovery

Reac-Gen: Geometric Modeling and Parametric Design

The Reac-Gen module is the entry point of the Reac-Discovery system and handles the geometric design and parametric modeling of the reactor. It generates periodic open-cell structures from a set of predefined mathematical equations, including Gyroid, Schwarz, and Schoen-G, and produces diverse geometric topologies at both macro and micro scales by adjusting three main parameters: size (S), level threshold (L), and resolution (R). Within the platform, Reac-Gen is mainly responsible for digital modeling and structural quantification, and its workflow has three main steps:

* Take the key geometric parameters of the structure as input. The system evaluates the predefined equations as a three-dimensional scalar field and extracts an implicit surface by isosurface calculation, which determines the overall shape and internal topology of the reactor;

* Mesh, scale, and cylindrically crop the resulting geometry so that it fits the reactor shape and yields a high-fidelity three-dimensional structure; at the same time, automatically correct boundary smoothness and pore continuity so that the structure remains physically connected and stable in both printing and fluid simulation;

* Generate the manufacturing and data analysis files and pass them to the next module, Reac-Fab, as input for printability prediction, 3D manufacturing, and performance data analysis.
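The scalar-field and cropping steps above can be illustrated with a minimal sketch. It uses the standard gyroid approximation sin x cos y + sin y cos z + sin z cos x; the function names, the cylinder dimensions, and the exact meaning given to S, L, and R are assumptions made for illustration, not the platform's actual code.

```python
import math

def gyroid(x, y, z):
    """Gyroid scalar field; its values lie in [-1.5, 1.5]."""
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))

def porosity_in_cylinder(size, level, resolution, radius=1.0, height=2.0):
    """Estimate the void fraction of a gyroid lattice cropped to a cylinder.

    Assumed parameter meanings (not taken from the paper):
      size       - number of unit cells across the cylinder diameter (S)
      level      - isosurface threshold; field > level counts as solid (L)
      resolution - sample points per axis (R)
    """
    k = 2.0 * math.pi * size / (2.0 * radius)  # maps one diameter to `size` periods
    step_xy = 2.0 * radius / resolution
    step_z = height / resolution
    solid = total = 0
    for i in range(resolution):
        x = -radius + (i + 0.5) * step_xy
        for j in range(resolution):
            y = -radius + (j + 0.5) * step_xy
            if x * x + y * y > radius * radius:
                continue  # cylindrical crop: skip samples outside the reactor wall
            for m in range(resolution):
                z = (m + 0.5) * step_z
                total += 1
                if gyroid(k * x, k * y, k * z) > level:
                    solid += 1
    return 1.0 - solid / total

# Raising the threshold leaves less solid material, so porosity increases.
for level in (-0.5, 0.0, 0.5):
    print(level, round(porosity_in_cylinder(size=2, level=level, resolution=24), 3))
```

A production pipeline would extract a triangle mesh from the same field (for example with a marching-cubes routine) before writing an STL file; the voxel count here only shows how the threshold controls the geometry.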

Reac-Gen outputs two kinds of files:

* STL files: Used for three-dimensional printing manufacturing;

* Structural feature files (XLSX): Record geometric descriptors such as surface area, porosity, tortuosity, and hydraulic diameter.
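For readers unfamiliar with these descriptors, the sketch below uses common textbook definitions: porosity as void volume over total volume, hydraulic diameter as four times the void volume over the wetted surface area, and tortuosity as flow-path length over straight-line length. The article does not state the exact formulas Reac-Gen uses, so these definitions and the function names are assumptions.

```python
import math

def porosity(void_volume, total_volume):
    """Fraction of the reactor volume that is open to flow."""
    return void_volume / total_volume

def hydraulic_diameter(void_volume, wetted_area):
    """d_h = 4 * void volume / wetted surface area (same length unit as the inputs)."""
    return 4.0 * void_volume / wetted_area

def tortuosity(path_length, straight_length):
    """Ratio of the actual flow-path length to the straight-line length (>= 1)."""
    return path_length / straight_length

# Sanity check on a shape with a known answer: one straight circular channel
# (radius 1 mm, length 10 mm) drilled through a 4 x 4 x 10 mm block.
r, h = 1.0, 10.0
void = math.pi * r ** 2 * h      # channel volume in mm^3
area = 2.0 * math.pi * r * h     # channel wall area in mm^2
print(round(porosity(void, 4.0 * 4.0 * h), 4))    # 0.1963
print(round(hydraulic_diameter(void, area), 4))   # 2.0, the channel diameter
print(tortuosity(h, h))                           # 1.0 for a straight channel
```

The straight-channel case is useful because the hydraulic diameter must reduce to the tube diameter, which makes it a quick check on any descriptor code.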

Workflow diagram of the Reac-Gen and Reac-Fab modules

Reac-Fab: From Feasibility Verification to Sample Printing

The Reac-Fab module is mainly responsible for the physical manufacturing of the reactor, using high-resolution stereolithography (SLA) 3D printing technology to construct the structure. The workflow of this module is divided into two steps:

* Receive the STL and structural data output by Reac-Gen, use a machine learning model to predict the printability of the structure, and configure the print settings and calibrate the equipment;

* Print the structure by high-resolution SLA with an optimized material formulation and printing parameters, then functionalize the printed part, for example by surface chemical modification and immobilization of catalytically active components, to obtain the final sample.

For printability verification, this module uses a neural network classification model trained on 236 experimental samples. The model judges whether a structure is printable by comparing theoretical and experimental weights for key geometric descriptors. According to the experimental data, its prediction accuracy reaches 91%, which improves manufacturing efficiency and reduces experimental cost. The module also runs without extensive preliminary experiments, which makes the algorithm easier to apply and extend to other printing systems (such as FDM printing with PLA).
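As a rough illustration of how such a classifier could be labeled and scored, the sketch below marks a print as successful when its measured weight is close to the theoretical weight, then computes accuracy as the share of correct predictions. This is one possible reading of the weight comparison; the 10% tolerance, the sample values, and the function names are hypothetical and do not come from the paper.

```python
def weight_label(theoretical_g, experimental_g, tolerance=0.10):
    """True if the printed part's weight is within `tolerance` of the theoretical weight.

    The tolerance is an assumed value chosen for illustration.
    """
    return abs(experimental_g - theoretical_g) / theoretical_g <= tolerance

def accuracy(predicted, actual):
    """Fraction of samples where the predicted label matches the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Made-up (theoretical, experimental) weights in grams for four prints.
samples = [(1.00, 0.97), (1.20, 1.45), (0.80, 0.79), (1.50, 0.90)]
actual = [weight_label(t, e) for t, e in samples]   # [True, False, True, False]
predicted = [True, False, True, True]               # hypothetical classifier output
print(accuracy(predicted, actual))                  # 0.75
```

In the platform, the predicted labels would come from the trained neural network rather than a hand-written list.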

Reac-Eval: Experimental Verification and Dual Optimization

Reac-Eval is the core module for experimental verification and optimization in the Reac-Discovery platform. It can evaluate several structured catalytic reactors designed by Reac-Gen and printed by Reac-Fab at the same time, with real-time monitoring and automatic control of multiphase reactions. All hardware is integrated through a unified Python-based interface so that experimental data, prediction models, and control systems connect seamlessly. The Reac-Eval workflow has five main steps:

* Define boundary conditions such as gas and liquid flow rates, temperature, concentration, and the range of topological descriptors, then generate randomized experimental combinations that cover the parameter space to initialize the experiment;

* Run multiple structured reactors in parallel on the self-driving platform, monitor reaction progress in real time with a benchtop NMR, and collect performance data;

* Optimize the process parameters with the neural network model M1, and when the optimized conditions fall short of expectations, add the results to the initial dataset and retrain the model;

* Optimize the geometric parameters of the reactor with the neural network model M2;

* Generate an optimized reactor design from the M2 predictions, verify it in a second round of experiments, and feed any results that fall short of expectations back to the model for further training.
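The first step of this workflow, drawing randomized conditions inside user-defined bounds, can be sketched as follows. The variable names and numeric ranges are placeholders rather than the values used in the study, and uniform sampling is an assumption; the article only says the combinations are randomized.

```python
import random

# Placeholder bounds; the real ranges are set by the experimenter.
BOUNDS = {
    "temperature_C": (25.0, 80.0),
    "liquid_flow_mL_min": (0.1, 1.0),
    "gas_flow_mL_min": (1.0, 10.0),
    "concentration_M": (0.05, 0.5),
}

def sample_conditions(bounds, n, seed=0):
    """Draw n experimental conditions uniformly at random within the bounds."""
    rng = random.Random(seed)  # fixed seed makes the plan reproducible
    return [
        {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n)
    ]

plan = sample_conditions(BOUNDS, n=60)
print(len(plan))
print(plan[0])
```

Each dictionary in `plan` would then be sent to the hardware interface as one run, and the measured yields would form the initial training set for M1.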

Overall, the Reac-Eval module optimizes process parameters and geometric topology together and builds an automated cycle of experiment, modeling, and feedback on the self-driving platform.

Workflow of the Reac-Eval module

Validating Reac-Discovery on Two Reactions

To verify how well Reac-Discovery performs in multi-scale coupling and machine learning-driven optimization, the research team selected two typical multiphase catalytic reactions as test cases: acetophenone hydrogenation and CO₂ cycloaddition. Both involve gas-liquid-solid three-phase conversion. The mild hydrogenation of acetophenone and the more complex CO₂ cycloaddition together test the robustness, stability, and reproducibility of the system in self-optimization and topological reconstruction.

Verification of Acetophenone Hydrogenation Reaction

In the first validation experiment, the research team ran acetophenone hydrogenation with immobilized palladium nanoparticles (PdNPs) as the catalyst and evaluated the optimization ability of Reac-Discovery in complex multiphase catalysis through a two-stage optimization path:

* First optimization stage (G1): Reac-Gen generates 9 Gyroid geometries to build reactors with markedly different porosity and surface area; Reac-Eval runs 60 hydrogenation experiments, monitors the reaction in real time by NMR, and collects the data used to train the M1 model;

* Second optimization stage (G2): The M2 model brings structural descriptors into the learning process to jointly optimize structure and performance.

The experimental data show that the M1 predictions agree closely with the experimental results and that the model can identify the optimal process window among more than one million parameter combinations, significantly reducing the cost of experimental exploration. In the G2 stage, the M2 model further improves prediction accuracy and identifies the best geometry by screening and comparing 480 printable POC structures, demonstrating the precision and robustness of the Reac-Discovery platform in multi-variable optimization and structure-function prediction.
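Screening a very large number of combinations is feasible because each candidate costs one model prediction rather than one experiment. The sketch below shows the idea with an exhaustive grid search over a stand-in surrogate; `surrogate_yield` is a hypothetical smooth function, not the trained M1 network, and whether the authors search a full grid is an assumption. Three variables at 100 values each would already give one million combinations; the example uses a smaller grid so that it runs instantly.

```python
import itertools
import math

def surrogate_yield(temperature, flow, concentration):
    """Hypothetical stand-in for a trained model: yield peaks at (60, 0.5, 0.2)."""
    return 100.0 * math.exp(
        -((temperature - 60.0) / 20.0) ** 2
        - ((flow - 0.5) / 0.3) ** 2
        - ((concentration - 0.2) / 0.1) ** 2
    )

def linspace(low, high, n):
    """n evenly spaced values from low to high, inclusive."""
    return [low + (high - low) * i / (n - 1) for i in range(n)]

def best_condition(model, grids):
    """Evaluate the model on every grid combination and return the best one."""
    return max(itertools.product(*grids), key=lambda point: model(*point))

grids = [
    linspace(20.0, 100.0, 41),   # temperature
    linspace(0.1, 1.0, 19),      # flow rate
    linspace(0.05, 0.5, 19),     # concentration
]
best = best_condition(surrogate_yield, grids)
print(len(grids[0]) * len(grids[1]) * len(grids[2]), "combinations screened")
print([round(v, 3) for v in best])
```

Only the top-ranked conditions then need to be confirmed on the physical platform.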

Optimization stage G1 of the acetophenone hydrogenation reaction experiment

Optimization stage G2 of the acetophenone hydrogenation reaction experiment

CO₂ Cycloaddition Reaction

To further verify the adaptability of the platform in complex multiphase systems, the research team conducted a verification experiment using the CO₂ cycloaddition reaction:

* First stage (G1): Reac-Eval completes 60 experiments under varied conditions on the self-driving platform, with real-time NMR monitoring, to generate the initial dataset; the neural network model M1 then predicts yield and identifies the theoretically optimal conditions;

* Second stage (G2): The M2 model integrates geometric descriptors and process parameters, optimizes the reactor topology and reaction conditions, and determines the optimal geometry by comparing printable POC structures.

The experimental results show that the results obtained under the theoretically optimal conditions are fully consistent with the predicted values, setting a new performance benchmark for three-phase immobilized reactors. In addition, the Reac-Discovery reactor maintains conversion rates of 40% to 90% across four different epoxide systems, supporting the cross-system generalization ability and stability of the platform.

