Zero overhead, no image hallucinations: mining normal-sample features via null space projection.
[Introduction] Large vision-language models (LVLMs) suffer from object hallucination: they generate descriptions of objects that do not exist in the image. A research team from Xi'an Jiaotong University has proposed a method called Nullu, which extracts a "Hallucination Subspace" (HalluSpace) and edits the model weights via null space projection, effectively eliminating hallucinations without any additional inference cost.
Currently, large vision-language models (LVLMs) commonly suffer from "object hallucination": the model fabricates descriptions of objects that do not exist in the image.
To eliminate such hallucinations efficiently, the research team from Xi'an Jiaotong University has proposed Nullu (Null space of HalluSpace), an efficient model weight editing method that performs null space projection with respect to the "Hallucination Subspace" (HalluSpace).
Paper link: https://arxiv.org/abs/2412.13817
Code link: https://github.com/Ziwei-Zheng/Nullu
The core idea of the method is to find the key difference between the features of truthful samples and those of hallucinated samples in the model's feature space.
To this end, the researchers extract the model's internal embedding features for "truthful description + image" and "hallucinated description + image" inputs, then apply principal component analysis to the difference between the two sets of embeddings to locate the key subspace responsible for hallucinations, namely HalluSpace.
Experimental results show that HalluSpace encodes overly strong prior preferences of the underlying large language model (LLM) on which the LVLM is built; previous studies have identified such priors as one of the main causes of hallucination.
Therefore, by orthogonalizing the model weights against HalluSpace, which projects the features of input samples onto its null space, this prior preference can be effectively removed, thereby suppressing hallucinated generations.
Nullu is simple to implement, training-free, easy to deploy, and introduces no additional inference overhead. It achieves excellent results on multiple hallucination-elimination tasks, and the work has been published at CVPR 2025.
Weight editing based on null space projection
The weight editing pipeline of Nullu consists of three steps: 1) constructing truthful-hallucinated data pairs; 2) extracting HalluSpace; 3) editing the model weights via null space projection.
Data pair construction
For any input with a "vision + text" structure, the researchers construct data pairs from which to extract the hallucination subspace. The two samples in a pair share the same image but differ in text: one text is a truthful description that accurately describes the objects in the image, serving as the negative sample; the other is a hallucinated description, serving as the positive sample.
The LURE [1] dataset can be used directly as the source of such pairs: each pair contains an image, its truthful description (Ground Truth, GT), and a hallucinated description (HD) obtained by keyword replacement.
LURE is constructed as follows: 5,000 images are randomly sampled from the MSCOCO 2014 training set, and their captions are taken as GT.
On this basis, the objects in each GT that are most likely to trigger hallucinations, such as high-frequency objects, are replaced to form the hallucinated description HD.
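For intuition, one such pair might look like the minimal sketch below; the field names and captions are invented for illustration and are not the actual LURE schema:

```python
# Hypothetical illustration of one truthful/hallucinated data pair.
# Field names and texts are invented for clarity; see LURE [1] for the real schema.
pair = {
    "image": "COCO_train2014_000000000001.jpg",
    "gt": "A man rides a horse across a grassy field.",
    # HD: a high-frequency object ("dog") injected by keyword replacement.
    "hd": "A man rides a horse across a grassy field next to a dog.",
}
```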
HalluSpace extraction
HalluSpace is extracted mainly in the feature space of the MLP layers in the language-model part of the LVLM. The overall pipeline is shown in the figure.
The language-model part is an LLM whose layers each contain a self-attention block and an MLP block. To extract HalluSpace, the positive and negative samples, carrying the hallucinated and truthful responses respectively, are first fed into the model, and the embedding features at every LLM layer are computed and stored. Each sample's features are then averaged over the sequence (length) dimension, the resulting embeddings are stacked into positive and negative feature matrices, and their difference matrix is computed.
Next, principal component analysis is performed on the difference matrix via SVD decomposition.
Finally, the right singular vectors corresponding to the top k singular values, i.e., the first k columns of the right singular matrix V, are selected.
These directions capture the main differences between truthful and hallucinated features and can be regarded as the directions in the model's feature space that drive hallucinated descriptions, namely HalluSpace.
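In code, the extraction can be sketched along these lines, assuming the per-layer hidden states have already been collected for every pair; the function name, tensor shapes, and choice of k here are our illustrative assumptions, not the official implementation:

```python
import torch

def extract_halluspace(pos_feats, neg_feats, k=8):
    """Sketch of HalluSpace extraction for a single LLM layer.

    pos_feats / neg_feats: lists of [seq_len, d] hidden states for the
    hallucinated (positive) and truthful (negative) text of each pair.
    Returns V_k: a [d, k] basis of the hallucination subspace.
    """
    # Average each sample over the sequence (length) dimension, then stack
    # all samples into [num_pairs, d] feature matrices.
    pos = torch.stack([f.mean(dim=0) for f in pos_feats])
    neg = torch.stack([f.mean(dim=0) for f in neg_feats])
    diff = pos - neg  # difference matrix

    # Principal component analysis of the difference matrix via SVD:
    # rows of Vh are the right singular vectors, sorted by singular value.
    _, _, Vh = torch.linalg.svd(diff, full_matrices=False)
    return Vh[:k].T  # top-k right singular vectors as columns, [d, k]
```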
Model weight editing based on null space projection
Since HalluSpace spans the main directions separating the distributions of truthful and hallucinated data, projecting model features onto its null space removes the latent hallucination information from the features.
And since all inputs share the same HalluSpace, projecting the model weights themselves onto the null space of HalluSpace eliminates the latent hallucination risk once and for all.
The edited parameters can be loaded straight back into the original model, so no additional computational overhead is introduced at inference time.
As shown in the figure below, whenever the internal features of an input contain a component in HalluSpace, the edited parameters remove that component, thereby reducing hallucinations.
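The edit itself amounts to one projection per chosen weight matrix. The sketch below assumes an orthonormal HalluSpace basis and applies the projection on the output side of an MLP down-projection, so that any HalluSpace component the layer would write into the residual stream is zeroed; the module naming follows common LLaMA-style conventions and may differ across models:

```python
import torch

@torch.no_grad()
def edit_down_proj(W, V_k):
    """Project an MLP down-projection weight onto the null space of HalluSpace.

    W:   [d, d_ff] down-projection weight (PyTorch Linear stores [out, in]),
         whose output lives in the d-dimensional residual stream.
    V_k: [d, k] orthonormal HalluSpace basis for this layer.
    """
    proj = torch.eye(W.shape[0], dtype=W.dtype, device=W.device) - V_k @ V_k.T
    return proj @ W  # any output component along HalluSpace is zeroed

# The edited weight can be written straight back into the checkpoint, e.g.
# (module path is LLaMA-style and hypothetical; adapt to the actual model):
# layer.mlp.down_proj.weight.copy_(edit_down_proj(layer.mlp.down_proj.weight, V_k))
```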
Existence and discussion of the hallucination subspace
The researchers further verified the existence of HalluSpace experimentally. The paper also decodes the information contained in HalluSpace, revealing how Nullu relates to existing methods, and analyzes why Nullu is effective, uncovering its potential connection to direct preference optimization (DPO).
Analysis of the existence of the hallucination subspace
If HalluSpace exists, then on held-out data (not the LURE dataset), the difference vectors between the features of truthful and hallucinated samples should have large components in HalluSpace.
To evaluate this, the researchers selected 100 descriptive questions on which LLaVA-1.5 hallucinated in the CHAIR test, and for each sample computed the difference vector between the embedding features with and without hallucination.
As a comparison baseline, 100 random vectors were also drawn.
To remove the influence of vector norms, all vectors were normalized.
Figure (a) shows a schematic of the vector distribution on the unit sphere.
After normalization, random vectors are spread uniformly over the sphere, so their projection components in the low-dimensional hallucination subspace are small; if the computed difference vectors truly capture hallucination information, their projection components in the subspace will be large.
The verification bears this out: the projection components of the difference vectors in the hallucination subspace are more than 10 times larger than those of the random control group.
This indicates that the subspace captures the hallucination-related feature directions in the LVLM's features, confirming the existence of the hallucination subspace.
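This check is easy to reproduce in spirit: normalize each vector and compare the energy retained after projecting onto the subspace basis. The helper below is an illustrative reconstruction rather than the authors' exact script:

```python
import torch

def projection_energy(vectors, V_k):
    """Average squared norm of unit vectors after projecting onto span(V_k).

    vectors: [N, d] difference (or random control) vectors.
    V_k:     [d, k] orthonormal basis of the hallucination subspace.
    """
    v = vectors / vectors.norm(dim=1, keepdim=True)   # normalize away the norm effect
    return (v @ V_k).pow(2).sum(dim=1).mean().item()  # mean ||V_k^T v||^2

# With real embedding-difference vectors, this value should come out roughly
# an order of magnitude larger than for a random control of the same shape:
#   projection_energy(diff_vectors, V_k)
#   projection_energy(torch.randn_like(diff_vectors), V_k)
```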
Other discussions and analyses
By decoding the information contained in HalluSpace, the paper finds that it carries many of the language model's prior preferences.
Projecting the model parameters onto the null space of HalluSpace therefore removes the model's internal language priors, effectively addressing object hallucination, an idea closely related to existing methods such as VCD [2].
In addition, the paper reveals a potential connection between Nullu and direct preference optimization (DPO), further supporting the method's effectiveness; we omit the details here.
Experiments and analyses
The researchers deployed the proposed method on LLaVA-1.5, MiniGPT-4, and mPLUG-Owl2, and verified Nullu's effectiveness on multiple datasets: hallucination evaluation on CHAIR and POPE, and general-capability tests on MME and LLaVA-Bench.
As the figure below shows, because Nullu modifies the model parameters directly through weight editing, it introduces no additional overhead during inference, achieving zero-overhead suppression of object hallucinations.
Compared with existing decoding-enhancement methods, Nullu resolves object hallucinations better and faster, and the improvement is not obtained by shortening the model's output.
To illustrate the mitigation effect intuitively, the researchers further show test cases of Nullu on the LLaVA-Bench open-ended generation task.
As shown in the figure below, for the same question, the model edited by Nullu successfully eliminated the object hallucinations in the output of the original model.
The researchers also ran an online test of Nullu's effectiveness: where the original model output the hallucinated object word "person", the model with edited weights output "mountain" instead, further demonstrating the effectiveness of the proposed method.
Conclusion
The researchers proposed Nullu, an object-hallucination elimination method based on model weight editing.
Nullu identifies the hallucination subspace in the multi-layer perceptron (MLP) part of each model layer: it extracts the low-rank subspace spanned by the differences between truthful and hallucinated features, then orthogonally projects the LVLM's weights against it, mitigating object hallucination.
Experimental results show that Nullu significantly mitigates object hallucinations without adding any inference cost, giving it a speed advantage over current decoding-stage and post-processing methods.
Meanwhile, the edited model maintains good performance on general LVLM benchmarks, showing that the gain in truthfulness does not compromise the model's overall capability.
Theoretical analysis shows that the method's weight update is inherently consistent with direct preference optimization. Comparison with prior studies further shows that the method reduces the language bias of the underlying LLM by adjusting the model parameters, and such language bias has been shown to be a key factor behind object hallucinations.
Author introduction
The authors of the paper are from the Artificial Intelligence Security Laboratory (AI-SEC) team at Xi'an Jiaotong University. The first author, Le Yang, is a specially appointed researcher at Xi'an Jiaotong University; the co-first author, Ziwei Zheng, is a doctoral student there; the corresponding author is Professor Chao Shen of Xi'an Jiaotong University.
References:
[1] Zhou, Yiyang, et al. "Analyzing and mitigating object hallucination in large vision-language models." In ICLR 2024.
[2] Leng, Sicong, et al. "Mitigating object hallucinations in large vision-language models through visual contrastive decoding." In CVPR 2024.
This article is from the WeChat public account "New Intelligence Yuan" (editor: LRST), published by 36Kr with authorization.