AI + New Materials: The "GPT Moment" and Paradigm Revolution in Materials Science
As a disruptive technology born of the deep integration of artificial intelligence and materials science, AI + new materials is driving a paradigm revolution in materials R&D, shifting from "empirical trial-and-error" to "intelligent creation." With the qualitative leap of large AI models in understanding complex structures, generating innovative solutions, and reasoning across scales, materials science is undergoing a fundamental transformation from "experience-driven" to "intelligence-driven." The field's "GPT moment" has arrived for the global research and industrial communities, giving rise to a development pattern driven by data, algorithms, and automated experiments.
Yunxiu Capital systematically maps out the integration path and technological trends of AI and materials science, analyzes the competitive advantages within this industrial ecosystem, and explores the investment opportunities it presents.
In the grand narrative of artificial intelligence, the emergence of generative pre-trained models is undoubtedly a watershed. It has not only redefined the boundaries of human-machine interaction but also, with its astonishing versatility and creativity, announced the dawn of the AGI era. As this disruptive wave sweeps across the ancient and fundamental field of materials science, a "GPT moment" for new materials is approaching.
For a long time, discovering new materials has been like searching for a needle in a haystack, relying on scientists' intuition, experience, and tens of thousands of trial-and-error experiments. From Edison testing thousands of substances to find a filament material to modern researchers spending years optimizing a single alloy formula, the "alchemy" of materials R&D has long been a bottleneck on industrial progress. With the qualitative leap of large AI models in understanding complex structures, generating innovative solutions, and reasoning across scales, however, materials science is undergoing a fundamental transformation from "experience-driven" to "intelligence-driven." AI is no longer merely a tool for auxiliary calculation; it has become a "research partner" capable of independently proposing hypotheses, designing experiments, and even discovering new material forms.
The Singularity Has Arrived: AI Reshapes the Materials R&D Paradigm, Moving from Empirical Trial-and-Error to Rational Design
The integration of AI and materials science did not happen overnight. Its development can be divided into three stages, each marking a leap in R&D efficiency and cognitive depth.
1.0 Era: The Foundation of Computational Materials Science (Late 20th Century to Around 2010)
The core of this stage is "computational assistance." Computational methods such as density functional theory (DFT) and molecular dynamics (MD) gave scientists powerful tools to simulate and predict material properties at the atomic scale. During this period, researchers built high-throughput computational databases such as the Materials Project, laying a valuable data foundation for subsequent data-driven research. However, the computational cost of methods like DFT is extremely high, making it impractical to screen millions or tens of millions of materials; their application was largely limited to mechanistic studies of known materials and small-scale performance optimization.
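To make this concrete, the sketch below queries the Materials Project through its official mp-api client for stable oxides in a target band-gap window. The API key is a placeholder, and exact field names can vary between client versions, so treat this as a minimal illustration rather than a definitive recipe.

```python
# Minimal sketch: screening candidates from the Materials Project database.
# Requires the official client: pip install mp-api
# "YOUR_API_KEY" is a placeholder; field names may differ across versions.
from mp_api.client import MPRester

with MPRester("YOUR_API_KEY") as mpr:
    # Search for thermodynamically stable oxides with a 1.0-2.0 eV band gap
    docs = mpr.materials.summary.search(
        band_gap=(1.0, 2.0),
        is_stable=True,
        elements=["O"],
        fields=["material_id", "formula_pretty", "band_gap"],
    )

for doc in docs[:10]:
    print(doc.material_id, doc.formula_pretty, doc.band_gap)
```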
2.0 Era: Data-Driven AI Exploration (2010-2023)
With the rise of machine learning and the continued expansion of material databases, AI + new materials entered the data-driven 2.0 era. Classical machine learning algorithms such as random forests and support vector machines were widely used to build quantitative models linking composition, process, structure, and performance. The breakthrough of this stage was that AI began to learn rules from large volumes of historical experimental data, enabling rapid property prediction and significantly reducing the number of unnecessary experiments. Limited by data quality, algorithmic generalization, and a lack of insight into the underlying physical and chemical mechanisms, however, the AI models of this period mainly acted as "predictors" rather than "creators," and their ability to discover genuinely new materials remained limited.
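A minimal sketch of this 2.0-era workflow, using scikit-learn with synthetic descriptors in place of a curated experimental dataset (the three features and the target property here are invented for illustration):

```python
# Sketch of a 2.0-era structure-property model: a random forest mapping
# composition/process descriptors to a target property. The data is synthetic;
# a real workflow would use curated experimental records.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 500
# Hypothetical descriptors: e.g. dopant fraction, sintering temp, grain size
X = rng.uniform(0.0, 1.0, size=(n, 3))
# Hypothetical property with a nonlinear dependence plus measurement noise
y = 2.0 * X[:, 0] ** 2 - X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", r2_score(y_te, model.predict(X_te)))
```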
3.0 Era: Intelligent Creation Led by Large Models (2024-Present)
With breakthroughs in pre-training technology, we have witnessed the rise of "large material models." These models perform self-supervised learning on vast multi-modal corpora of scientific literature, crystal structure databases (such as ICSD and the Materials Project), and experimental data, thereby mastering the "universal grammar" of the material world.
They are gradually developing three GPT-like core capabilities:
Emergent Ability: The model can understand cross-domain materials knowledge, uncover implicit rules that are difficult for human experts to detect, and predict properties across material systems.
Generative Creation: AI is no longer limited to screening known materials; much as it generates text, it can "generate" new, theoretically stable crystal structures or molecular formulas to match performance requirements.
Transfer Learning and Physical Enhancement: A general base model pre-trained on large volumes of known material data carries rich chemical and physical priors. When facing a new system, the model need not be trained from scratch; through transfer learning combined with an active learning strategy, it uses a small amount of high-confidence data (or DFT calculation results) for fine-tuning and boundary correction, significantly reducing experimental costs while keeping predictions consistent with physical and thermodynamic laws (a minimal sketch of such a loop follows this list).
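A minimal sketch of that active learning step, assuming a Gaussian-process surrogate as a stand-in for the pre-trained model and a cheap mock function as a stand-in for DFT labelling:

```python
# Sketch of active learning: adapt a surrogate to a new material system by
# repeatedly labelling only the most uncertain candidate with an expensive
# oracle (here a mock function standing in for a DFT calculation).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def dft_oracle(x):
    """Mock stand-in for an expensive DFT calculation."""
    return np.sin(3.0 * x) + 0.5 * x

rng = np.random.default_rng(1)
pool = rng.uniform(0.0, 2.0, size=(200, 1))   # unlabelled candidate pool
X = rng.uniform(0.0, 2.0, size=(5, 1))        # a few initial labelled points
y = dft_oracle(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-4)
for step in range(10):
    gp.fit(X, y)
    _, std = gp.predict(pool, return_std=True)
    i = int(np.argmax(std))                   # most uncertain candidate
    X = np.vstack([X, pool[i:i + 1]])         # send it to the oracle
    y = np.append(y, dft_oracle(pool[i, 0]))

print("labelled points used:", len(y))
```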
The arrival of this moment means materials R&D has officially entered a new era of intelligent generation and precise design.
Value Anchor: Crossing the "Valley of Death" from Laboratory to Production Line; Engineering Implementation Is the Hard Truth
According to QYResearch, the global AI for Science market was approximately $4.538 billion in 2025 and is expected to reach $26.23 billion by 2032, a compound annual growth rate of 28.9%. This is a huge market, yet the potential of AI empowerment goes far beyond it.
Across six downstream industries (chemicals, pharmaceuticals, new energy, alloys, displays, and semiconductors), the total downstream market that AI4S can address approaches $11 trillion. At an R&D penetration rate of 2.5%, the annual output value could exceed $140 billion.
The most profound change in AI + new materials is not merely using artificial intelligence to accelerate scientific discovery (AI for Science, or AI4S); the field is making a strategic leap from "laboratory intelligence" to "engineering and manufacturing intelligence" (AI for Engineering & Manufacturing).
"One generation of materials, one generation of industries," as the industry saying goes: materials innovation is the foundation and precursor of industrial upgrading, and materials breakthroughs directly determine the technological level and form of an industry.
In other words, if a virtual material that performs superbly in a database cannot be mass-produced stably and economically, its industrial value is moot. The key measure of an AI + new materials company's competitiveness has therefore shifted from "how many new materials have been discovered" to "how many AI-designed materials have been successfully turned into mass-producible products."
This "implementation is king" paradigm requires the AI system to go beyond pure scientific computing and deeply integrate engineering thinking and manufacturing constraints. It is no longer an isolated algorithmic model but an intelligent hub running through the entire design-experiment-manufacturing chain.
First, at the design source, AI must have the foresight of design for manufacturing (DFM). Three points are key:
Front-Loading Engineering Constraints: From the very beginning, the AI model must treat engineering and manufacturing constraints such as raw-material cost, synthesis-path complexity, equipment compatibility, and environmental safety as part of the optimization objective rather than as afterthoughts (a sketch of such a composite objective follows this list).
Physical Closed-Loop Verification: Every "thought" of the AI must be verified quickly and cheaply in the physical world. Deep coupling with an automated laboratory ("dark lab") to form a dry-wet iterative flywheel is the key to ensuring design feasibility.
Full-Life-Cycle Perspective: An excellent AI platform should not only design good materials but also predict their long-term stability, recyclability, and environmental impact in end products, providing customers with comprehensive solutions beyond the materials themselves.
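To illustrate constraint front-loading, the sketch below ranks candidates by a composite objective that discounts predicted performance for raw-material cost and synthesis complexity. The fields, weights, and numbers are all hypothetical:

```python
# Sketch of design-for-manufacturing scoring: candidates are ranked not by
# predicted performance alone, but by a composite objective that also
# penalizes raw-material cost and synthesis complexity. Weights are
# hypothetical and would be tuned per project.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    predicted_performance: float  # from the AI property model
    cost_per_kg: float            # estimated raw-material cost, $/kg
    synthesis_steps: int          # proxy for process complexity

def dfm_score(c: Candidate, w_cost: float = 0.02, w_steps: float = 0.1) -> float:
    return (c.predicted_performance
            - w_cost * c.cost_per_kg
            - w_steps * c.synthesis_steps)

candidates = [
    Candidate("A", predicted_performance=9.1, cost_per_kg=120.0, synthesis_steps=8),
    Candidate("B", predicted_performance=8.4, cost_per_kg=15.0, synthesis_steps=3),
]
for c in sorted(candidates, key=dfm_score, reverse=True):
    print(c.name, round(dfm_score(c), 2))
```

Note that candidate B, though lower-performing on paper, wins once manufacturability is priced in; that inversion is precisely what constraint front-loading is meant to surface.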
In the execution stage, the AI-driven automated experimental platform must evolve from passive execution to autonomous decision-making, building a high-throughput, high-precision physical verification loop. Three core capabilities matter most:
Unmanned Operation: The AI system must coordinate and schedule automated synthesis, characterization, and testing equipment, running the entire process unattended, from raw-material proportioning and reaction-condition control to performance testing. By combining robotic arms with microfluidics, for example, hundreds of formulations can be synthesized and screened in parallel within a day, an order of magnitude faster than manual operation.
Real-Time Data Feedback and Model Iteration: The massive data streams generated by experiments (temperature, pressure, spectral signals, and so on) must be fed back to the AI model in real time so it can dynamically optimize subsequent experimental plans. This dry-wet iterative flywheel quickly corrects deviations between theoretical prediction and experimental results, closing the prediction-verification-optimization loop (a minimal sketch of such a loop follows this list).
Anomaly Detection and Autonomous Error Correction: The AI must perceive and handle experimental anomalies in real time. When equipment malfunctions or a reaction deviates from expectations, the system can automatically trigger contingency plans (such as suspending the reaction or adjusting parameters) and learn from historical data to avoid similar problems, ensuring experimental continuity and reliability.
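A minimal sketch of that flywheel, assuming a random-forest surrogate whose ensemble spread serves as an uncertainty proxy and a mock function standing in for the automated lab:

```python
# Sketch of the prediction-verification-optimization flywheel: a surrogate
# proposes the next batch of formulations via an upper-confidence-bound rule,
# a mock function stands in for the automated lab, and each batch of results
# feeds back into the model.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

def run_experiment(x):
    """Mock automated-lab measurement of a formulation x in [0, 1]^2."""
    return -((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2) + 0.01 * rng.normal()

X = rng.uniform(size=(8, 2))                      # initial formulations
y = np.array([run_experiment(x) for x in X])

for cycle in range(5):
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    grid = rng.uniform(size=(1000, 2))            # candidate formulations
    preds = np.stack([t.predict(grid) for t in model.estimators_])
    ucb = preds.mean(axis=0) + preds.std(axis=0)  # optimism under uncertainty
    batch = grid[np.argsort(ucb)[-4:]]            # top-4 proposals this cycle
    X = np.vstack([X, batch])
    y = np.append(y, [run_experiment(x) for x in batch])

print("best formulation found:", X[np.argmax(y)])
```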
Ultimately, the essence of this change is to promote AI from a powerful research assistant to the "chief technology officer" driving industrial value. It marks a shift in the focus of AI + new materials from exploring the unknown scientific frontier to solving real-world industrial pain points.
Ecosystem Reconstruction: Breaking the "Silo Effect," a Deep-Coupling Battle over Computing Power, Data, and Scenarios
Mass production is never a single technological breakthrough but a complex, system-wide project. In the traditional research model, working alone no longer meets the demands of the AI era: universities and research institutes command the most cutting-edge algorithms and theory but often lack real industrial scenarios and pilot-scale verification platforms, while traditional materials companies have clear market pain points and rich application scenarios but typically face weak computing infrastructure, thin accumulation of high-quality data, and a shortage of digital talent.
This mismatch between supply and demand leads to serious internal waste of resources. Only through deep industry-university-research-application collaboration, tightly coupling the computing power of hardware makers, the data held by all parties, the algorithms of technology companies, and the scenarios of industry leaders, can the "last mile" from theoretical design to large-scale production truly be bridged.
In the future, AI + new materials will no longer be a simple software purchase but a systematic revolution spanning the computing-power base, data standards, intelligent algorithms, and physical verification:
At the bottom layer, GPU makers provide the engine and platform providers set the standards that awaken dormant data. This layer solves the problems of the computing-power bottleneck and data silos; its core lies in connection rather than possession.
GPU companies are the physical engine of the entire ecosystem. AI-based materials R&D involves heavy quantum-mechanical calculations and molecular dynamics simulations, which demand extremely high parallel computing capability. GPU makers provide not only the core accelerator cards but also the underlying parallel computing architecture, which determines the speed and efficiency of upper-layer model training.
Data standard setters (governments and industry associations) are the connectors of the ecosystem. The data itself is not held by any platform; it is scattered across thousands of professors' laboratories and corporate R&D departments. The platform's core value lies in formulating unified standards for data collection, storage, and exchange and in building a secure, trusted circulation mechanism. Through technologies such as federated learning or data spaces, the data of academics and enterprises can be used for training on GPU clusters without compromising privacy.
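As a toy illustration of the federated idea, the sketch below averages locally fitted model weights FedAvg-style, so that neither party's raw data ever leaves its own site; both datasets and the linear model are synthetic stand-ins:

```python
# Toy FedAvg-style sketch: two data owners train local linear models on their
# own (private) data, and only the fitted weights are shared and averaged.
# Raw data never leaves either site.
import numpy as np

def local_fit(X, y):
    """Ordinary least squares on one site's private data."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
true_w = np.array([1.5, -0.7])

# Site A (e.g. a university lab) and Site B (e.g. a corporate R&D department)
X_a = rng.normal(size=(100, 2)); y_a = X_a @ true_w + 0.1 * rng.normal(size=100)
X_b = rng.normal(size=(150, 2)); y_b = X_b @ true_w + 0.1 * rng.normal(size=150)

# Each site shares only its weights; the coordinator averages them,
# weighted by sample count (the FedAvg aggregation rule).
w_a, w_b = local_fit(X_a, y_a), local_fit(X_b, y_b)
w_global = (100 * w_a + 150 * w_b) / 250
print("federated estimate:", w_global, "true weights:", true_w)
```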
The middle layer is the intelligent hub connecting the bottom-layer infrastructure and the top-layer applications. Its core value lies in building a dual-engine system of AI algorithms plus simulation software, resolving the long-standing tension between efficiency and accuracy in materials R&D.
The technology/service providers at this layer no longer rely solely on a single technology but deeply integrate the two:
The AI algorithm platform acts as the "accelerator." Using pre-trained large models and generative AI, it rapidly screens candidates across a vast chemical space, compressing a screening process that once took months into hours and solving the efficiency problem of searching for a needle in a haystack.
The simulation software provider acts as the "verifier." First-principles physical simulation provides rigorous mechanistic verification of AI predictions, ensuring that a design not only fits the data but also withstands scientific scrutiny, solving the credibility problem of black-box prediction (this two-stage funnel is sketched below).
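A schematic sketch of this accelerator-verifier funnel: a cheap surrogate scores a large generated pool, and only the shortlist reaches the expensive high-fidelity check. Both scoring functions are mock stand-ins for an AI model and a first-principles code:

```python
# Sketch of the accelerator + verifier funnel: a fast surrogate scores a large
# pool of generated candidates, and only the top slice goes on to an expensive
# physics-based check. Both functions are mock stand-ins.
import numpy as np

rng = np.random.default_rng(4)

def surrogate_score(X):
    """Fast AI model: milliseconds per candidate (mock)."""
    return X.sum(axis=1) + 0.3 * rng.normal(size=len(X))

def first_principles_check(x):
    """Slow high-fidelity verification: hours per candidate in reality (mock)."""
    return x.sum()

pool = rng.uniform(size=(100_000, 4))                   # generated candidates
top = pool[np.argsort(surrogate_score(pool))[-50:]]     # accelerator step
verified = [x for x in top if first_principles_check(x) > 3.0]  # verifier step
print(f"{len(pool)} generated -> {len(top)} shortlisted -> {len(verified)} verified")
```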
At the top layer, automated laboratories and industry leaders form a human-machine collaboration loop, compressing R&D cycles from years to days.
This layer is where value is realized. Industry leaders such as Wanhua Chemical and Shengquan Group open up real production-line requirements and verification scenarios and integrate deeply with automated laboratories. Formulations proposed by AI are rapidly verified in automated labs, and the new data generated is fed back to the middle-layer algorithms and bottom-layer platforms, forming a data flywheel.
This full-chain collaboration marks a shift in industrial competition from single-point technology contests to confrontations between ecosystem alliances. Only innovation consortia that can integrate GPU computing power, aggregate scattered data, and anchor themselves in industrial scenarios can build an unshakable systemic advantage in the AI + new materials race.
Technological Conflict: The Deep Integration of AI and MGE Is Constrained by the Structural Problem of "Data Scarcity"
With the ecosystem gradually taking shape, evolution on the technology side is also accelerating. AI is no longer just an auxiliary tool; it has undergone a chemistry-like deep fusion with materials genome engineering (MGE).
The core idea of materials genome engineering (MGE) borrows from biological genomics: it treats the microscopic features of a material (atomic arrangement, chemical composition, crystal defects, and so on) as "genes" and its macroscopic properties (strength, conductivity, heat resistance, and so on) as "phenotypes." Its goal is to replace the passive, trial-and-error model of traditional materials R&D with rational design and efficient development, built on a database of composition-process-structure-performance relationships.
Although MGE has laid the groundwork for data and high-throughput methods, it still faces stubborn challenges in practice that keep its potential from being fully realized:
Abundant Data, Scarce Information: MGE generates large volumes of high-dimensional data such as crystal structures, band diagrams, and stress-strain curves, but human scientists struggle to extract the deep, nonlinear physical laws hidden within. The subtle influence of trace-element doping on high-temperature creep, for example, is often buried in the noise of tens of thousands of data points, beyond what the human brain can effectively identify.
The Cost-Accuracy Dilemma: High-throughput computing is fast but of limited accuracy (classical force fields, for example); high-accuracy computing (such as DFT) is precise but extremely expensive, with a single complex system taking days or even weeks. Against a potential material space of hundreds of millions of candidates, traditional computing alone screens far too slowly.
Mostly Forward Screening, Weak Reverse Design: Traditional MGE mainly "screens" within known or simply combined material spaces: "here is a structure; compute its properties." Given a clear performance requirement such as "a material that withstands 2000°C with a density below 3 g/cm³," MGE lacks an effective reverse-generation capability and struggles to actively create new material structures.
Combining AI with MGE is like fitting it with a "super brain" and an "autopilot system."
Value 1: AI as a Law Decoder, Cracking High-Dimensional Structure-Property Relationships
AI can automatically extract deep physical and chemical features from MGE's high-throughput computational and experimental data and build surrogate models with millisecond-level response. First-principles calculations that once took days can now be approximated by AI with high accuracy almost instantly, slashing screening costs and letting researchers evaluate millions of candidates quickly, a true leap from looking at data to understanding laws.
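A minimal sketch of such a surrogate, fitting a small neural network to synthetic "DFT" labels and then timing bulk inference; real surrogates would use physically meaningful structural descriptors rather than random features:

```python
# Sketch of a DFT surrogate: fit a small neural network to (synthetic)
# high-fidelity labels, then measure inference latency. Random features
# stand in for learned structural descriptors for brevity.
import time
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(2000, 16))           # stand-in structural descriptors
y = np.cos(X[:, 0]) + X[:, 1] * X[:, 2]   # stand-in DFT-computed property

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                         random_state=0).fit(X, y)

batch = rng.normal(size=(1_000_000, 16))  # a million candidate structures
t0 = time.perf_counter()
surrogate.predict(batch)
dt = time.perf_counter() - t0
print(f"{len(batch)} predictions in {dt:.2f}s "
      f"({1e9 * dt / len(batch):.0f} ns each)")
```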
Value 2: AI as a Reverse Designer, Enabling On-Demand Customization
The introduction of generative AI (such as diffusion models and variational autoencoders) endows MGE with a powerful "reverse design" capability. The R&D logic has been completely inverted: users need only input target performance indicators (such as "band gap of 1