Good news for the poor. MIT research: No need to stack graphics cards, just copy the top models.
High-scoring models may not understand science; some are just "memorizing by rote"! MIT reveals: the smarter the model, the more similar its understanding of matter becomes. Since the path to truth is clear, why should we get caught up in the expensive computing power race?
Today's AI for Science is like a "multinational summit," where everyone describes the same thing in different languages.
Some let AI read SMILES strings, while others show AI the 3D coordinates of atoms. They are competing on different tracks to see who can make more accurate predictions.
But there is a problem: are these AIs "finding patterns," or do they really understand the underlying physical truth?
In a study by MIT, researchers gathered 59 models with very different "backgrounds" and compared how similar their hidden-layer representations of matter are.
Paper link: https://arxiv.org/abs/2512.03750
The results were striking: although these models look at data in very different ways, once they become powerful enough, their understandings of matter become extremely similar.
What's even more amazing is that a code model that reads text can actually be highly aligned in "cognition" with a physics model that calculates forces.
They climbed to the top of the same mountain along different paths and began to jointly depict the "ultimate map" of physics and reality.
The Convergence of Truth: Why Do Top Models Become More and More Alike?
To verify whether these models are really approaching the truth, researchers introduced a key indicator: representational alignment.
Put simply, it's about seeing how similar the thinking patterns of two models are when processing the same molecule.
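The article doesn't spell out the exact similarity metric, but one standard way to quantify this is linear CKA (centered kernel alignment) between two models' hidden features on the same molecules. The sketch below is only a minimal illustration, with synthetic feature matrices standing in for real activations:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear centered kernel alignment between two representation matrices.

    X: (n_samples, d1) hidden features from model A on n molecules.
    Y: (n_samples, d2) hidden features from model B on the same n molecules.
    Returns a similarity in [0, 1]; higher means the two models "think" more alike.
    """
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return float(num / den)

# Synthetic demo: model B carries exactly the same information as model A,
# just expressed in a different (rotated) coordinate system.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(512, 256))
rotation, _ = np.linalg.qr(rng.normal(size=(256, 256)))
feats_b = feats_a @ rotation
print(f"alignment = {linear_cka(feats_a, feats_b):.2f}")  # 1.00: same content, different axes
```

Because the score is invariant to how each model rotates or rescales its own feature axes, two models can "think alike" even when their raw vectors look nothing like each other.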
The results showed that the stronger the performance of the model, the closer its thinking style becomes.
In the experiment, as the models' accuracy in predicting the energy of matter improved, their representations spontaneously converged toward the same direction in representation space.
The Synchronization of Performance and Cognition: The more accurate the energy prediction, the more similar the thinking style of the model is to that of the top base models. Each point represents a model; the size of the point corresponds to the size of the model.
Although the architectures of these AIs vary widely, when processing the same batch of molecular data, the complexity of their feature spaces is compressed to a very narrow range.
No matter how complex the outer shells of the models are, they ultimately capture the most core and concise physical information.
From Complexity to Simplicity: Although AI architectures vary, the physical features they extract are "reaching the same goal by different routes" in terms of mathematical complexity.
This feature is even more obvious in models like Orb V3.
Cross-architecture Representational Alignment: The dark areas in the matrix show a strong resonance between high-performance models like Orb V3 and other rigorous physics models (such as MACE, EqV2).
Through more flexible training, they can align with physical laws more accurately.
This also shows that when an AI is fed enough data and trained in the right way, it can even go beyond existing human formulas and figure out the essential laws of matter on its own.
This convergence phenomenon indicates that AIs are not thinking randomly; they are working together to uncover the unique, real, and objective underlying logic of the material world.
Not Only for Molecules, but Also for "Cats"!
Do you think this "great minds think alike" only happens in scientific AIs? You're completely wrong!
Some researchers compared pure-text language models (such as the GPT series) with pure-image visual models (such as the models behind CLIP or DALL·E). The results showed that their understanding of "cats" is becoming more and more similar!
In language models, the vector representation of "cat" is closely associated with words like "furry," "meowing," "pet," and "catching mice."
In visual models, the vector of "cat" is close to visual features such as whiskers, round eyes, soft fur, and an elegant tail.
Originally, these two models, one looking only at text and the other only at pictures, had no intersection at all.
But as the model size increases and the performance improves, the representations of "cat" in these two completely different modalities get closer and closer in the linear space, as if they are sharing the same "essence of a cat"!
This means that no matter whether an AI starts from text, images, molecular structures, or 3D coordinates, as long as it is powerful enough, it will secretly tend towards the same "internal picture" of reality.
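To make "closer in the linear space" concrete: if two embedding spaces really share the same structure, a single linear map should carry every text concept close to its image counterpart. The toy sketch below uses synthetic vectors (not actual GPT or CLIP embeddings) and orthogonal Procrustes to fit that one map:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins for the two embedding spaces: the same 200 hypothetical concepts
# ("cat" among them), once as "text" vectors and once as "image" vectors that
# differ by an unknown rotation plus a little noise.
text_vecs = rng.normal(size=(200, 64))
hidden_rotation, _ = np.linalg.qr(rng.normal(size=(64, 64)))
image_vecs = text_vecs @ hidden_rotation + 0.05 * rng.normal(size=(200, 64))

# Orthogonal Procrustes: the single rotation W that best maps text space onto
# image space. If the two models share one "internal picture", one linear map
# should work for every concept at once.
U, _, Vt = np.linalg.svd(text_vecs.T @ image_vecs)
W = U @ Vt
mapped = text_vecs @ W

# Does each mapped text vector land nearest to its own image vector?
dists = np.linalg.norm(mapped[:, None, :] - image_vecs[None, :, :], axis=-1)
match_rate = (dists.argmin(axis=1) == np.arange(len(text_vecs))).mean()
print(f"nearest-neighbor match rate after one linear map: {match_rate:.0%}")
```

A high match rate across all concepts at once means the two spaces differ mostly by a change of coordinates, not by content.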
High Scores Are Not the Truth; Beware of "Lost" AIs
If high-performance models are converging at the top of the mountain, what are the remaining models doing?
Researchers found that poorly performing models meet one of two "fates": either they think alone and drift further and further down the wrong path, or they "dumb down" together, thinking alike yet all missing the key information.
Some models may have good scores, but their thinking styles are very solitary.
For example, MACE-OFF performs strongly on certain molecular tasks, but its representational alignment is extremely low, and it simply does not align with the mainstream high-performance models.
It may have only found some patterns in a specific field. Once it steps out of this comfort zone, it's difficult to transfer its experience to other scientific tasks.
The white dots in the figure represent molecular structures that the model has never seen. It can be seen that when the model processes these structures, the error (MAE) surges, and the representation completely deviates from the normal physical distribution.
When AIs encounter substances that never appeared in the training data, they often stop "thinking" and collectively retreat into the "comfort zone" left by their designers, losing the most essential chemical features of those substances.
It can be seen that training data is not only the nourishment for models but also the foundation that determines whether a model can touch the truth.
If the data is not diverse enough, even if the model architecture is very sophisticated, it will ultimately just be treading water and unable to evolve into a truly general base model.
Truth Is Unique. How Far Are We from Computing Power Freedom?
Since experiments have proven that different AIs are converging towards the same physical understanding, is it still necessary to stack expensive graphics cards and train a super-large model from scratch?
Obviously not. And the research already points to a shortcut: "model distillation."
Research has found that smaller models can also show amazing potential by imitating the "thinking styles" of those high-performance base models.
We no longer need to blindly chase ever-larger parameter counts. Instead, we can exploit this "convergence toward the truth" to copy the knowledge of large models into lighter, more efficient small ones.
The size of the dots in the figure represents the number of model parameters. It can be seen that even smaller models can achieve extremely high accuracy in molecular energy prediction tasks as long as their representations can be aligned with the best-performing models.
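The exact distillation recipe isn't described here, so the following is only a generic sketch of the idea, in PyTorch with illustrative names and dimensions: a small student is trained both to predict the energy label and to match a (hypothetical) large teacher's hidden representation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative dimensions only: the student reads a 128-d input descriptor,
# and the teacher is assumed to expose a 256-d hidden representation per sample.
class SmallStudent(nn.Module):
    def __init__(self, in_dim: int = 128, hidden: int = 64, teacher_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.SiLU())
        self.energy_head = nn.Linear(hidden, 1)           # predicts the energy label
        self.align_head = nn.Linear(hidden, teacher_dim)  # projects into the teacher's feature space

    def forward(self, x: torch.Tensor):
        h = self.encoder(x)
        return self.energy_head(h), self.align_head(h)

def distill_step(student, optimizer, inputs, teacher_feats, energy_labels, alpha: float = 0.5):
    """One training step: fit the energy label AND align with the teacher's representation."""
    pred_energy, pred_feats = student(inputs)
    loss_energy = F.mse_loss(pred_energy, energy_labels)  # supervised target
    loss_align = F.mse_loss(pred_feats, teacher_feats)    # representational alignment
    loss = loss_energy + alpha * loss_align
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Minimal usage with random tensors standing in for a real batch:
student = SmallStudent()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
x = torch.randn(32, 128)
teacher_feats = torch.randn(32, 256)   # would come from the frozen large model
energy_labels = torch.randn(32, 1)
print(distill_step(student, opt, x, teacher_feats, energy_labels))
```

The weighting between the two losses (`alpha` here) would need tuning in practice; the point is that the alignment term lets the small model inherit the teacher's "thinking style" rather than only its final answers.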
This has far-reaching significance for the development of future models.
Orb V3 gives us another answer to the "bitter lesson": with large-scale training and clever regularization, a simple architecture can also learn the kind of understanding that previously only expensive models with hand-imposed physical constraints could reach.
Comparison of Multiple Architectures (Partial): The paper evaluated nearly 60 models including Orb, MACE, and DeepSeek, providing a quantitative basis for scientists' choices.
In the future, the criteria for evaluating a scientific AI will become more diverse: we will look not only at its current "scores" but also at whether it has entered the "convergence circle of truth."
Once we master the logic of this alignment, scientific discovery will no longer be just a computing power race among giants. More lightweight AIs targeting specific scenarios will emerge like mushrooms after rain, truly realizing an innovation explosion under "computing power freedom."
The research by MIT has poured cold water on the fanatical AI race but also pointed out a clear path.
The path to advancement in scientific AI is no longer about more complex architectures or more beautiful physical formulas, but about who can more steadily enter that "convergence circle."
We no longer need to get caught up in the computing power race, because the path to truth is clear: all capable models are heading in the same direction. That makes model lightweighting and knowledge transfer through "representational alignment" the most practical engineering route.
The future of science will belong to those who know how to use convergence to reduce costs.
References:
https://the-decoder.com/scientific-ai-models-trained-on-different-data-are-learning-the-same-internal-picture-of-matter-study-finds/
https://arxiv.org/abs/2512.03750
https://www.quantamagazine.org/distinct-ai-models-seem-to-converge-on-how-they-encode-reality-20260107/
This article is from the WeChat official account "New Intelligence Yuan". Author: New Intelligence Yuan. Republished by 36Kr with permission.