
Is AI triggering a crisis in biomedical research? Top scientists were nearly deceived by AI-generated fake papers. National Business Daily exclusive interview with Maxim Topaz, who experienced it first-hand and authored The Lancet article

Friends of 36Kr, 2026-06-17 11:56
AI has driven a surge in fake citations in biomedical papers, and the field needs to verify citations before peer review

In May 2026, a correspondence article titled "AI Citation Fraud" published in The Lancet became a hot topic in China's medical research community.

Based on a screening of approximately 2.5 million biomedical papers in PubMed Central (a free full-text archive of biomedical literature run by the US National Library of Medicine), the article reported that the rate of forged references in biomedical papers has increased more than 12-fold in the past few years. In 2023, there were about 4 forged references per 10,000 papers; by early 2026, the figure had reached 56.9 per 10,000 papers.

Notably, Maxim Topaz, the study's lead researcher, is an associate professor at the Columbia University School of Nursing, a medical AI researcher, and ranked among the world's top 2% of scientists. Yet even this expert, who has long worked with AI (artificial intelligence), was once "fooled" by a false paper generated by AI while writing a commentary.

So, what can we do about this? A reporter from National Business Daily (hereinafter referred to as "NBD") interviewed Maxim Topaz on this issue. The following is the transcript of the interview.

Maxim Topaz, an associate professor at the School of Nursing of Columbia University and a medical AI researcher. Photo source: Provided by the interviewee

False citations are widespread across the literature, and 98.4% of problematic papers have been neither corrected nor retracted

NBD: What prompted you to start paying attention to citation fraud in biomedical papers?

Maxim Topaz: It all started with a "narrow escape" of my own. I was using an AI chat tool to polish a commentary intended for submission to a journal. As an artificial intelligence researcher, I was aware of AI's "hallucination" problem, so I carefully checked all the citations to make sure the content was accurate. Even after multiple rounds of revision and self-checking, the journal editor still questioned one of the references. It turned out that the AI tool had quietly inserted a false paper, and I had not noticed it during my earlier checks.

This incident affected me deeply. More alarming than the mistake itself is the hidden danger behind it: if even professionals who work with AI year-round can be deceived, ordinary researchers are naturally more vulnerable. So I decided to conduct a study. Previously, no one had counted the proportion of false citations that ultimately made their way into peer-reviewed, formally published literature. References are the foundation of the entire scientific system. Once citations lose their credibility, the entire research edifice stands on shaky ground. Our team carried out this study precisely to fill that gap.

NBD: You hold positions at both the School of Nursing and the Institute of Data Science at Columbia University. What key role did this interdisciplinary background play in building this automated citation verification system? What was the biggest technical challenge your team faced during the R&D process?

Maxim Topaz: Expertise in both clinical medicine and data science is indispensable. Clinical knowledge helps the team determine which problems have practical impact and understand what normal citations look like in different subfields, so that ordinary citation errors can be distinguished from malicious fraud. Data science makes large-scale automated verification possible, removing the limitations of manual checking.

The biggest technical challenge during the R&D process was false positives. More than 97 million references had to be verified. Even if the system's false positive rate was extremely low, it would still generate a huge number of false alerts. The core challenge was to accurately distinguish deliberate fraud from innocent typos and from normal formatting variations such as abbreviated titles.
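To make the scale of the false positive problem concrete, here is a minimal arithmetic sketch. Only the 97 million figure comes from the interview; the false positive rates are hypothetical values chosen for illustration, and the calculation treats nearly all references as genuine, which is a simplifying assumption.

```python
# Illustrative arithmetic only. The rates below are assumptions,
# not figures reported by the study.
TOTAL_REFERENCES = 97_000_000  # "more than 97 million references" (from the interview)


def expected_false_alarms(total_refs: int, false_positive_rate: float) -> int:
    """Expected number of genuine references wrongly flagged.

    Simplification: almost every reference is genuine, so the number of
    genuine references is approximated by the total.
    """
    return round(total_refs * false_positive_rate)


if __name__ == "__main__":
    for rate in (0.01, 0.001, 0.0001):  # hypothetical rates
        alarms = expected_false_alarms(TOTAL_REFERENCES, rate)
        print(f"false positive rate {rate:.2%}: about {alarms:,} false alerts")
```

Even a rate of one in ten thousand would leave thousands of alerts to review by hand, which is why separating typos and formatting variants from real fabrication matters so much at this volume.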

In response, the team built a multi-level verification process that included an initial screening by a large language model, with independent manual reviewers invited to verify the results. The system's accuracy eventually reached 91%. Building a reliable, trustworthy verification system at this data volume was the hardest part of the entire project.
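The idea of tiered checking can be illustrated with a small sketch. This is not the team's system: it swaps in simple string similarity from Python's standard library for the language-model screening, and the index, titles, and 0.85 threshold are all hypothetical. It only shows how a reference might be sorted into "verified", "probably a typo or formatting variant", or "send to a human reviewer".

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a bibliographic index. A real system would
# query a literature database rather than an in-memory list.
KNOWN_TITLES = [
    "Deep learning for sepsis prediction in intensive care units",
    "Nurse staffing levels and patient outcomes: a cohort study",
]


def normalize(title: str) -> str:
    """Lowercase and drop punctuation so formatting differences are ignored."""
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def best_similarity(title: str, index: list[str]) -> float:
    """Highest character-level similarity between the title and any indexed title."""
    target = normalize(title)
    return max(
        (SequenceMatcher(None, target, normalize(t)).ratio() for t in index),
        default=0.0,
    )


def triage(title: str, index: list[str] = KNOWN_TITLES, near: float = 0.85) -> str:
    """Sort a cited title into one of three tiers (threshold is an assumption)."""
    score = best_similarity(title, index)
    if score == 1.0:
        return "verified"
    if score >= near:
        return "likely typo or formatting variant"
    return "needs manual review"
```

In this sketch only the last tier reaches a human reviewer, so innocent typos do not swamp the manual step. Abbreviated titles would need more than string similarity to be handled properly.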

NBD: This verification covered approximately 2.5 million biomedical papers and 125 million references. Why did you choose to conduct such a large-scale analysis? How big is the gap between the field's previous understanding of citation fraud and the actual situation revealed by your research?

Maxim Topaz: We worked at large scale because the incidence of citation fraud in any single paper is relatively low, and reliable conclusions cannot be drawn from individual cases. We verified a total of 2,471,758 open-access papers and more than 125 million references. Only in this way could we measure the overall incidence of fraud and, more importantly, trace its long-term trend.

The field's previous understanding is very different from the actual situation. People generally believed that citation fraud was a problem caused by individual authors' misconduct or careless writing. However, the data show that false citations are now widespread across the biomedical literature; since 2023, the citation fraud rate has increased more than 12-fold. At the time of this verification, 98.4% of the papers with fraudulent citations had been neither corrected nor retracted. In short, the severity of the problem and the lag in correcting it far exceeded the field's previous expectations.

Quarterly incidence of forged references per 10,000 papers in PubMed Central from January 2023 to February 2026. Photo source: The article "Forged Citations: A Verification Analysis of 2.5 Million Biomedical Papers"

Review papers are hit hardest by citation fraud, which can mislead doctors and policymakers

NBD: Why did the citation fraud rate start to rise sharply from mid-2024? In your opinion, is the main cause artificial intelligence, the paper-writing-for-hire industry, or loopholes in the journal review process?

Maxim Topaz: The timing is telling. Large language models came into wide use from late 2022 through 2023, and it usually takes 100 to 200 days for a biomedical paper to be published after submission. As a result, papers written with the help of artificial intelligence began to appear in large numbers in the US National Library of Medicine's database from mid-2024. That is exactly the turning point at which the fraud rate rose sharply.

It should be noted that this study only confirmed that the problem exists and did not directly establish its cause. The spread of the paper-writing-for-hire industry and changes in journal indexing rules and review mechanisms have also pushed up the fraud rate, and these factors compound one another: precisely because journals lack an effective verification step, false citations produced by artificial intelligence or by the paper-writing-for-hire industry can be published without obstruction.

Therefore, the problem cannot be attributed to a single cause. Objectively speaking, artificial intelligence makes it easy to fabricate citations, and the current review mechanism was not originally designed to detect this type of fraud.

NBD: Compared with citations fabricated by hand in the past, what are the core differences in false citations generated by artificial intelligence? And what broader impacts will they have?

Maxim Topaz: The most fundamental difference lies in the type of error. In the past, citation problems were mostly careless slips, such as wrong page numbers or misstatements of what a cited paper said, but the cited papers themselves actually existed.

Now, the papers behind citations generated by artificial intelligence do not exist at all. These false citations follow a standard format, carry the names of real, well-known researchers in the field, are relevant to the paper's topic, and have plausible publication dates, which is enough to pass an initial check. Routine peer review often has difficulty detecting them.

The far-reaching harm is that citations are the core basis on which researchers verify research conclusions, and large-scale fraud has now become a reality. The problem has changed from "incorrect citation content" to "the cited paper does not exist at all". This is no longer a decline in the quality of evidence but a direct break in the evidence chain of scientific argument.

NBD: During the verification process, what was the most extreme and shocking citation fraud case you found? How did you feel when you saw these cases?

Maxim Topaz: The most typical case was a 2025 paper in an open-access oncology journal that focused on a specific surgical field. Of the 30 references verified in this paper, 18 were fraudulent. These false citations closely matched the paper's research direction, the listed authors were all real experts in the field, and the publication dates were concentrated in 2023 and 2024.

Another phenomenon deserves equal attention. In 11 papers published by one journal within a year, the same two author names appeared repeatedly. These papers contained 15 false citations and covered multiple unrelated cutting-edge research fields.

Compared with a single problematic paper, I am more worried about this kind of large-scale fraud. Even more disturbing, these problematic papers remain in the public literature database and are still being cited by other papers, yet there has been no warning or correction, and the field has not questioned them.

NBD: The citation fraud rate of review papers is 57% higher than that of other types of papers, and reviews are the basis for formulating clinical practice guidelines. Why are review papers particularly vulnerable to AI-driven citation fraud?

Maxim Topaz: Several factors combine to make review papers the hardest hit. First, reviews have longer reference lists, making it easier for false citations to slip through. Second, writing a review requires sorting through and summarizing a large body of literature, which is also the task for which researchers most often turn to artificial intelligence, and this is precisely the kind of work in which false citations tend to be generated.

In addition, reviews sit upstream in the research evidence chain: systematic reviews build on reviews, and clinical practice guidelines build on systematic reviews. Our data show 16.7 fraudulent citations per 10,000 review papers, compared with 10.6 for other types of papers. The harm caused by this 57% gap is far greater than the number itself suggests. Fraudulent content in reviews does not stop there but is passed on layer by layer, ultimately affecting the core evidence base that clinicians and policymakers rely on.
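As a quick check on how the 57% figure relates to the two rates quoted, the relative gap can be computed directly. This is a minimal sketch using only the rounded rates given in the interview; the function name is illustrative, and the study's own unrounded figures may differ slightly.

```python
def relative_increase(rate: float, baseline: float) -> float:
    """Fraction by which `rate` exceeds `baseline` (0.5 means 50% higher)."""
    return rate / baseline - 1


# Rounded rates per 10,000 papers, as quoted in the interview.
REVIEW_RATE = 16.7
OTHER_RATE = 10.6

gap = relative_increase(REVIEW_RATE, OTHER_RATE)

if __name__ == "__main__":
    print(f"Review papers exceed other paper types by about {gap:.1%}")
```

With these rounded inputs the gap comes out between 57% and 58%, consistent with the figure cited.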

China's medical research has deeply participated in the global academic system. Photo source: Charted by the National Business Daily reporter based on public data

If the industry does not take timely control measures, the pollution of the literature database may be irreversible

NBD: How will false citations mislead clinical decisions and threaten patient safety? Has the medical community underestimated such real-world risks?

Maxim Topaz: False citations will have a negative impact along the complete evidence chain. Clinical practice guidelines are formulated based on systematic reviews. There is already evidence to confirm that some papers written for hire have been included in the systematic reviews used for formulating guidelines. If a guideline cites papers that contain a large number of false citations, the treatment plan it proposes will lose its scientific support.

It should be clear that we did not track the actual treatment results of patients, so we cannot quantify the medical harm directly caused by false citations, nor will we make such hasty judgments. However, there is a structural risk in the existing research evidence system, and this risk has indeed been underestimated by the medical community.

Some systematic reviews have found that about a quarter of the references in medical papers contain errors of some kind, which shows that reference verification is not a routine part of peer review. If ordinary citation errors are already hard to catch comprehensively, carefully disguised AI-generated false citations are harder still.

NBD: Your research put forward four improvement suggestions for the industry. In your opinion, which suggestion is the most urgent but also the most difficult to implement? What are the main obstacles?

Maxim Topaz: The most urgent suggestion is the first: journal publishers need to build automated citation verification into the submission process, before peer review starts. The relevant technology is already mature, and the obstacle is not technical but institutional and financial. Publishers need to invest money and adjust long-established workflows, which is why this suggestion looks feasible yet runs into many difficulties in practice.

The most difficult to implement is the retrospective cleanup of published literature. Screening millions of existing papers one by one and issuing corrections is costly; no single institution is willing to take full responsibility for the work, and the academic community lacks the motivation to review and correct published papers.

In summary, the most urgent task now is to implement citation verification at submission, before peer review; the most difficult task is to clean up the existing academic literature that is already polluted.

NBD: As a scholar who was the first to systematically expose the citation fraud crisis in the biomedical field, what is your biggest concern for the entire industry in the next 3 to 5 years? What action do you call on the global research community, journal publishers, and regulatory agencies to take immediately?

Maxim Topaz: My biggest concern is a vicious cycle. Once a paper with false citations is published, it will be cited by new papers and even used to train the next generation of artificial intelligence models, allowing the fraudulent content to spread and amplify continuously. If this is not brought under control in time, the literature database will be polluted faster than it can be cleaned and repaired.

I call on the global research community, publishers, and regulatory agencies to immediately implement one measure: make automated citation verification a standard process before peer review.

To put it bluntly, the root of the problem is that unverified AI-generated content has flowed into the permanent academic literature. We are not trying to ban the use of AI tools but to embed verification into the entire workflow. AI itself is not the hidden danger; the real risk is allowing unreviewed AI-generated content to enter the academic system openly.

This article is from the WeChat official account "National Business Daily Headlines", author: Lin Zichen, editor: Huang Bowen. Republished with permission from 36Kr.