Despite the troubles at Meta, Yann LeCun is still publishing scientific papers at a brisk pace. His new work: JEPA learns not only features, it can also precisely capture the density of the data.
Still publishing vigorously, despite being "tormented" by Meta's publication-review rules!
LeCun, who has hinted that he may resign, has delivered new research, once again in collaboration with three colleagues from FAIR.
The new paper by Yann LeCun's team has uncovered a hidden skill of the self-supervised model family JEPAs (Joint Embedding Predictive Architectures) —
It has learned the "density" of data.
The "data density" here can be understood as the commonness of data: samples with high density are more typical and common data, while those with low density are rare or even abnormal data.
JEPAs were originally regarded as models good only at feature extraction. This time, LeCun's team found that during training the model quietly masters the ability to perceive how common a sample is.
This means that once a JEPA is successfully trained, it can be used to judge how common a sample is, without any extra work.
This overturns the long-standing academic view that "JEPAs only learn features and have nothing to do with data density".
Core discovery: anti-collapse makes JEPAs accurately learn the data density
To understand why this discovery matters, let's first talk about JEPAs.
[Figure: from Figure 12 of "A Path Towards Autonomous Machine Intelligence"]
As the self-supervised learning framework that LeCun's team has championed in recent years, JEPAs' core advantage is that they require no manual annotation: the model learns feature patterns from massive data on its own, and once trained it can be adapted directly to downstream tasks such as image recognition and cross-modal matching. It is a representative model for efficient learning in AI.
Previously, the academic community generally believed that there were only two core goals in the training of JEPAs:
- One is latent-space prediction: after slight perturbations (cropping, color adjustment) are applied to the original data (such as images), the feature representation of the perturbed data (the data's form inside the model) should be accurately predictable from the features of the original data;
- The other is anti-collapse: preventing the features of all samples from collapsing to a single point (a minimal sketch of both objectives follows below).
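To make these two objectives concrete, here is a minimal, hypothetical PyTorch sketch. The toy encoder/predictor shapes and the VICReg-style variance penalty standing in for anti-collapse are illustrative assumptions, not the paper's exact losses:

```python
# Minimal sketch of the two JEPA training objectives described above.
# The toy encoder/predictor and the variance penalty are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # toy encoder f
predictor = nn.Linear(128, 128)                                     # latent predictor g

def jepa_losses(x, x_perturbed):
    z = encoder(x)                       # features of the original view
    z_hat = predictor(z)                 # predicted features of the perturbed view
    with torch.no_grad():
        z_target = encoder(x_perturbed)  # target features (stop-gradient)

    # Goal 1: latent-space prediction -- match the perturbed view's features.
    pred_loss = F.mse_loss(z_hat, z_target)

    # Goal 2: anti-collapse -- keep per-dimension variance away from zero,
    # so all samples cannot map to the same embedding.
    std = z.std(dim=0)
    anti_collapse = F.relu(1.0 - std).mean()

    return pred_loss + anti_collapse

x = torch.randn(64, 3, 32, 32)
loss = jepa_losses(x, x + 0.05 * torch.randn_like(x))  # crude stand-in for cropping/color jitter
loss.backward()
```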
The new discovery in the paper comes from anti-collapse.
If the features of all data are identical, the model has learned nothing. So in the past, anti-collapse was simply regarded as a safeguard against feature failure, and its deeper role went unnoticed.
LeCun's team focused on the hidden value of anti-collapse. Through a derivation using the change-of-variables formula and high-dimensional statistics, they proved that anti-collapse not only prevents feature collapse but also makes JEPAs accurately learn the data density.
Theoretically, when JEPAs output Gaussian embeddings (features approximately uniformly distributed on the hypersphere in high-dimensional space), the model must perceive the data density through the Jacobian matrix (which captures the model's response to small changes in a sample) in order to satisfy the training constraints. In other words, learning the data density is not accidental but an inevitable consequence of JEPA training.
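Informally, the change-of-variables argument behind this can be sketched as follows (a rough reconstruction from the description above, not the paper's exact statement): if the embeddings $z = f(x)$ are roughly uniform on the hypersphere, so that $p_z(f(x)) \approx \text{const}$, then the encoder's local volume change must absorb the data density.

```latex
% Change of variables for the encoder f with Jacobian J_f(x):
%   p_z(f(x)) \approx p_x(x) / \sqrt{\det(J_f(x) J_f(x)^\top)}
% With p_z approximately constant, solving for the data density gives
\[
  \log p_x(x)
  \;=\; \tfrac{1}{2}\,\log\det\!\bigl(J_f(x)\,J_f(x)^{\top}\bigr) + \mathrm{const}
  \;=\; \sum_i \log \sigma_i\!\bigl(J_f(x)\bigr) + \mathrm{const}.
\]
% I.e. the density is readable, up to a constant, from the log-spectrum
% of the Jacobian -- exactly the quantity JEPA-SCORE computes below.
```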
To make this hidden density-perception ability usable in practice, the team also proposed a key tool: JEPA-SCORE.
It is a quantitative indicator that extracts the data density from a JEPA; its core function is to score how common a sample is.
The calculation is simple and efficient. Take the Jacobian matrix of the JEPA as it processes the target sample, compute the matrix's spectrum (its eigenvalues, or singular values for a rectangular Jacobian), take logarithms, and sum them: the result is the JEPA-SCORE. The higher the score, the more typical the sample (high data density); the lower the score, the rarer or more anomalous the sample (low data density).
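As a concrete illustration, here is a minimal sketch of that recipe in PyTorch. The toy encoder is a hypothetical stand-in for a trained JEPA, and singular values stand in for the "eigenvalues" above, since the Jacobian is generally rectangular:

```python
# Minimal sketch of the JEPA-SCORE recipe described above: sum of the
# log-spectrum of the encoder's Jacobian at a sample. The toy encoder
# and the exact normalization are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for a trained JEPA encoder

def jepa_score(x: torch.Tensor) -> torch.Tensor:
    """Higher score -> more typical sample (higher data density)."""
    # Jacobian of the embedding w.r.t. one input sample, flattened to (emb_dim, input_dim).
    J = torch.autograd.functional.jacobian(
        lambda inp: encoder(inp.unsqueeze(0)).squeeze(0), x
    )
    J = J.reshape(J.shape[0], -1)
    # Log-spectrum: sum of log singular values (eps guards against log 0).
    sigma = torch.linalg.svdvals(J)
    return torch.log(sigma + 1e-12).sum()

sample = torch.randn(3, 32, 32)
print(f"JEPA-SCORE: {jepa_score(sample).item():.3f}")
```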
More importantly, JEPA-SCORE is highly universal: it is picky about neither the dataset nor the JEPA architecture.
Whether on ImageNet, the handwritten-digit dataset MNIST, or data never seen during pre-training (such as nebula images), it can be computed accurately;
Whether with I-JEPA or DINOv2 (single-modal vision models) or MetaCLIP (a multi-modal model), any successfully trained model in the JEPA family can be used directly, with no additional training.
To verify the reliability of this discovery, the team also conducted multiple groups of experiments.
On ImageNet, the JEPA-SCORE judgments of different JEPA models on typical samples (such as birds in flight) and rare samples (such as birds in a resting posture) overlap strongly, showing that this is a shared ability of JEPAs rather than a quirk of one model;
On a galaxy-image dataset not seen during pre-training, the JEPA-SCOREs are significantly lower than on ImageNet data, showing that the model accurately recognizes unfamiliar data;
In practical tests of data screening and anomaly detection, JEPA-SCORE also outperforms traditional methods.
[Figure: data screening scenario]
[Figure: anomaly detection scenario]
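As a rough illustration of those two scenarios, building on the hypothetical `jepa_score` sketch above: data screening amounts to ranking samples by score, and anomaly detection to thresholding it.

```python
# Illustrative use of the jepa_score sketch above (hypothetical pipeline):
# rank a dataset by typicality, and flag the lowest-scoring samples.
import torch

dataset = [torch.randn(3, 32, 32) for _ in range(100)]  # stand-in for real images
scores = torch.tensor([jepa_score(x).item() for x in dataset])

# Data screening: keep the most typical samples (highest density).
keep = scores.argsort(descending=True)[:50]

# Anomaly detection: flag samples whose score falls far below the mean.
threshold = scores.mean() - 2 * scores.std()
anomalies = (scores < threshold).nonzero(as_tuple=True)[0]
print(f"kept {len(keep)} typical samples, flagged {len(anomalies)} anomalies")
```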
Research team
This research is not the work of LeCun alone.
The other three core authors are all researchers at Meta FAIR.
Randall Balestriero is an assistant professor of computer science at Brown University and has been deeply involved in the fields of artificial intelligence and deep learning for a long time.
He has been researching learnable signal processing since 2013, and technology he helped develop was used by NASA for marsquake (Mars earthquake) detection.
He received his doctorate from Rice University in 2021 and then joined Meta AI as a postdoctoral fellow under Yann LeCun.
Nicolas Ballas holds a doctorate from the University of Grenoble, France.
From April to September 2010, he was an R&D intern at LTU Technologies, working on large-scale clustering for image retrieval.
He has been a research scientist at FAIR since 2017, more than eight years now.
Michael Rabbat is a founding member of FAIR. He holds a bachelor's degree in engineering from the University of Illinois at Urbana-Champaign, a master's degree in engineering from Rice University, and a doctorate in electrical engineering from the University of Wisconsin-Madison.
His research focuses on three major fields: optimization algorithms, distributed algorithms, and signal processing.
Before joining Meta, Mike was a professor in the Department of Electrical and Computer Engineering at McGill University.
Paper address:
https://arxiv.org/abs/2510.05949
This article is from the WeChat official account "QbitAI". Author: Wen Le. Republished by 36Kr with permission.