Did large models develop a "brain"? Research finds that the middle layers of LLMs spontaneously evolve brain-like organization.
The evolutionary paths of biological intelligence and artificial intelligence are completely different. But do they nonetheless follow common computational principles?
Recently, researchers from Imperial College London, Huawei Noah's Ark Lab, and other institutions published a new paper arguing that large language models (LLMs) spontaneously evolve a "synergistic core" during learning, a structure reminiscent of the biological brain.
Paper title: A Brain-like Synergistic Core in LLMs Drives Behaviour and Learning
Paper address: https://arxiv.org/abs/2601.06851
The research team used the Partial Information Decomposition (PID) framework to conduct in-depth analyses of models such as Gemma, Llama, Qwen, and DeepSeek.
They found that the middle layers of these models exhibit extremely strong synergistic processing, while the early and late layers process information in a more redundant fashion.
Synergy and Redundancy: The Internal Architecture of LLMs
The research team treated large language models as distributed information-processing systems, and the core experimental design quantified how the models' internal components interact. To this end, the researchers selected several representative model families, including Gemma 3, Llama 3, Qwen 3 8B, and DeepSeek V2 Lite Chat, for comparative analysis.
Experimental Methods and Quantitative Indicators
During the experiments, the researchers fed the models cognitive task prompts spanning six categories, including grammar correction, logical reasoning, and general-knowledge Q&A.
For each prompt, the model generated a response of 100 tokens while the activation values of every attention head (or expert module) in each layer were recorded.
Specifically, the researchers took the L2 norm of each unit's output vector as that unit's activation intensity at each time step.
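As a rough illustration of this recording setup, the sketch below hooks a small open model and logs per-head activation norms during generation. GPT-2 merely stands in for the models studied in the paper, and the hook placement (reading the concatenated head outputs just before the attention output projection) is our assumption about one straightforward implementation, not the authors' code.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

n_head = model.config.n_head
head_dim = model.config.n_embd // n_head
records = {i: [] for i in range(model.config.n_layer)}  # layer -> per-step (n_head,) norms

def make_hook(layer_idx):
    def pre_hook(module, args):
        x = args[0]  # (batch, seq, n_embd): concatenated per-head outputs
        per_head = x.reshape(x.shape[0], x.shape[1], n_head, head_dim)
        # L2 norm over each head's subspace = that head's activation intensity
        records[layer_idx].append(per_head.norm(dim=-1)[0, -1].detach())
    return pre_hook

hooks = [blk.attn.c_proj.register_forward_pre_hook(make_hook(i))
         for i, blk in enumerate(model.transformer.h)]

enc = tok("Correct the grammar: she go to school yesterday.", return_tensors="pt")
with torch.no_grad():
    model.generate(**enc, max_new_tokens=100, do_sample=False)

for h in hooks:
    h.remove()
# records[layer] is now a time series of per-head activation norms.
```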
Based on these time series, the research team applied the Integrated Information Decomposition (ΦID) framework, an extension of PID to dynamical systems.
This framework decomposes the interaction between any pair of attention heads into atomic information terms such as "persistent synergy" and "persistent redundancy".
By ranking all head pairs by their synergy values and by their redundancy values, and taking the difference of the two ranks, the researchers obtained a key indicator: the synergy-redundancy rank. It shows whether a component tends to aggregate signals independently or to perform deep cross-unit integration when processing information.
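The sketch below illustrates how such a per-unit score could be assembled once pairwise synergy and redundancy values are in hand; the random matrices merely stand in for the output of an actual ΦID computation, which is beyond the scope of this snippet.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
n = 32                                    # e.g. attention heads under analysis
synergy = rng.random((n, n))              # placeholder for pairwise ΦID synergy atoms
synergy = (synergy + synergy.T) / 2
redundancy = rng.random((n, n))           # placeholder for pairwise redundancy atoms
redundancy = (redundancy + redundancy.T) / 2

iu = np.triu_indices(n, k=1)              # every unordered head pair
rank_diff = rankdata(synergy[iu]) - rankdata(redundancy[iu])

# Per-unit score: average rank difference over all pairs containing the unit.
score = np.zeros(n)
for (i, j), d in zip(zip(*iu), rank_diff):
    score[i] += d
    score[j] += d
score /= n - 1
# High score -> the unit's interactions are synergy-dominated.
```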
Spatial Distribution Patterns across Models
The experimental data revealed a highly consistent spatial organization across models with different architectures. Plotted against normalized layer depth, the synergy distribution traces a pronounced "inverted U-shaped" curve:
Redundant Periphery: The early layers (near the input) and late layers (near the output) showed very low synergy ranks, with information processing dominated by redundancy. In the early layers this reflects basic detokenization and local feature extraction; in the late layers it corresponds to token prediction and output formatting.
Synergistic Core: The middle layers exhibited very high synergy ranks, forming a core processing region. In the heat-map analysis of Gemma 3 4B, for example, the middle-layer attention heads showed dense, strong synergistic interactions; this is where the model performs advanced semantic integration and abstract reasoning.
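A toy way to check for this inverted U is to bin per-head scores by normalized depth and compare the middle of the network against its two ends. Here `scores_by_layer` is a hypothetical dict of per-head score arrays, e.g. produced along the lines of the earlier sketches.

```python
import numpy as np

def depth_profile(scores_by_layer, n_layers, n_bins=10):
    total = np.zeros(n_bins)
    count = np.zeros(n_bins)
    for layer, scores in scores_by_layer.items():
        b = min(int(layer / n_layers * n_bins), n_bins - 1)  # normalized-depth bin
        total[b] += scores.sum()
        count[b] += scores.size
    return total / np.maximum(count, 1)   # mean score per depth bin

# e.g. profile = depth_profile(scores_by_layer, n_layers=32)
# An inverted U appears as middle bins exceeding both ends.
```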
Architectural Differences and Consistency
Notably, the emergence of this "synergistic core" does not depend on a specific technical implementation.
In the DeepSeek V2 Lite model, even when the researchers used "expert modules" rather than "attention heads" as the unit of analysis, they observed the same spatial distribution.
This cross-architecture convergence indicates that synergistic processing may be a computational necessity for achieving advanced intelligence, rather than just an engineering coincidence.
This organizational pattern mirrors the physiology of the human brain: its sensory and motor areas likewise show high redundancy, while the association cortex responsible for complex cognition sits at the center of a highly synergistic "global workspace".
The Emergence of Intelligence: Driven by Learning, Not Architecture
A key question is whether this structure is inherent in the Transformer architecture or acquired through learning.
By analyzing the training process of the Pythia 1B model, the researchers found that this "inverted U-shaped" synergy distribution did not exist in the randomly initialized network. As the number of training steps increased, this organizational architecture gradually formed and stabilized.
This means that the synergistic core is an emergent signature of learned capability: a product of training rather than of the architecture itself.
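This developmental analysis is easy to reproduce in outline because the Pythia suite publishes its intermediate checkpoints as HuggingFace Hub revisions. The sketch below reloads the model at several training steps (the particular steps are our illustrative choice) so the per-head analysis can be re-run at each.

```python
from transformers import AutoModelForCausalLM

# Revisions like "step1000" are published branches of the Pythia repos;
# which steps to sample is our choice, not the paper's.
for step in [0, 512, 4000, 143000]:
    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/pythia-1b", revision=f"step{step}"
    )
    # ... recompute the per-head synergy-redundancy profile here ...
```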
In terms of topology, the synergistic core has very high "global efficiency", which favours rapid integration of information, while the redundant periphery shows stronger "modularity", which suits specialized processing. This again parallels the network organization of the human brain.
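Both graph statistics are standard and can be computed with networkx on a thresholded interaction graph, as in the sketch below; the thresholding and graph construction are illustrative assumptions rather than the paper's procedure.

```python
import numpy as np
import networkx as nx
from networkx.algorithms import community

def graph_stats(weights, threshold):
    """weights: symmetric (n x n) pairwise interaction matrix (e.g. synergy)."""
    adj = (weights > threshold).astype(int)   # keep only strong interactions
    np.fill_diagonal(adj, 0)
    G = nx.from_numpy_array(adj)
    eff = nx.global_efficiency(G)             # high -> fast global integration
    parts = community.greedy_modularity_communities(G)
    mod = community.modularity(G, parts)      # high -> segregated, specialized modules
    return eff, mod
```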
Functional Verification of the Synergistic Core
To verify whether the synergistic core really drives the model's behavior, the research team conducted two types of intervention experiments: ablation experiments and fine-tuning experiments.
Ablation Experiments: Ablating high-synergy nodes caused a catastrophic drop in performance and drove the model's behaviour far from baseline, with an impact far greater than random ablation or ablation of redundant nodes. This demonstrates that the synergistic core is the core driver of the model's intelligence.
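A head ablation of this kind can be approximated by zeroing a head's contribution before the attention output projection and then re-evaluating the model. The sketch below reuses the GPT-2 hook placement from the earlier recording sketch and is our hypothetical implementation, not the authors'.

```python
import torch

def make_ablation_hook(heads_to_zero, n_head, head_dim):
    def pre_hook(module, args):
        x = args[0]
        b, s, _ = x.shape
        x = x.reshape(b, s, n_head, head_dim).clone()
        x[:, :, heads_to_zero, :] = 0.0       # silence the chosen heads
        return (x.reshape(b, s, n_head * head_dim),)  # replaces the module input
    return pre_hook

# e.g. ablate heads 2 and 5 in layer 10 of the GPT-2 stand-in:
# handle = model.transformer.h[10].attn.c_proj.register_forward_pre_hook(
#     make_ablation_hook([2, 5], n_head, head_dim))
# ... evaluate, then handle.remove()
```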
Fine-tuning Experiments: Under reinforcement-learning fine-tuning (RL FT), training only the synergistic core yielded significantly larger performance gains than training the redundant periphery or a random subset. Interestingly, this gap was not evident under supervised fine-tuning (SFT). The researchers read this as evidence that RL promotes generalization, whereas SFT leans more on memorization.
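Restricting fine-tuning to one subnetwork amounts to freezing every parameter except those of the chosen components. The sketch below does this for the attention parameters of a few "core" layers of the GPT-2 stand-in from earlier; the layer indices are made up for illustration.

```python
import torch

core_layers = {10, 11, 12, 13}        # hypothetical "core" layer indices

for p in model.parameters():          # freeze everything...
    p.requires_grad = False
for i, blk in enumerate(model.transformer.h):
    if i in core_layers:              # ...then unfreeze only the core's attention
        for p in blk.attn.parameters():
            p.requires_grad = True

# Give the optimiser only the trainable subset.
optim = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)
```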
Conclusion
This research opens a new path for large-model interpretability: it shows we can understand models "top-down" through the lens of information theory, rather than only searching "bottom-up" for specific circuits.
For AI, identifying the synergistic core could help design more efficient compression schemes, or accelerate training through more targeted parameter updates. For neuroscience, it offers computational support for the idea that synergistic circuits play a crucial role in reinforcement learning and knowledge transfer.
Although large models are built on silicon chips and backpropagation, in the pursuit of intelligence they seem to have converged on an organizational pattern similar to the biological brain's. This convergent evolution of intelligence may be a key clue to the mystery of general intelligence.
This article is from the WeChat official account "MachineHeart", edited by Panda. It is published by 36Kr with authorization.