Google DeepMind's AlphaFold won the Nobel Prize, but did DeepMind overlook a key piece of prior work?
AlphaFold's Nobel Prize has sparked controversy. Research presented by a doctoral student at NeurIPS 2016 may be the "prototype" of AlphaFold, and the student's supervisor, Daniel Cremers, has now spoken out, asking why DeepMind ignored this work and never cited it.
AlphaFold has gained great fame for winning the Nobel Prize.
In most cases, AlphaFold 2's predictions approach the accuracy of X-ray crystallography, which is truly astonishing.
A half-century-old problem in biochemistry has finally been solved.
However, back in 2016, Vladimir Golkov presented at NeurIPS a method for predicting protein contact maps directly from co-evolutionary data using deep neural networks.
On the CASP11 benchmark, the method outperformed all existing approaches of the time, and it can be regarded as a "prototype" of AlphaFold.
Recently, Daniel Cremers, director of the Munich Center for Machine Learning and professor at the Technical University of Munich, stated that his team's work laid the foundation for AlphaFold's Nobel Prize.
Now Cremers is asking: why has this historical cornerstone been ignored?
Let's find out.
The Prototype of AlphaFold Actually Appeared in 2016
In December 2018, AlphaFold 1 made a stunning debut at the 13th Critical Assessment of protein Structure Prediction (CASP13), ranking first.
In November 2020, AlphaFold 2 shone at CASP14 with a median GDT score of 92.4, approaching the maximum of 100. On May 8, 2024, AlphaFold 3 was released.
But as early as 2016, at the top AI conference NeurIPS, Vladimir Golkov gave a plenary talk on protein structure prediction.
The method he presented consisted of the following steps (a rough code sketch follows the list):
For the target amino acid sequence, query a sequence database of proteins with known three-dimensional structures
Use hidden Markov models (HMMs) to build a multiple sequence alignment and identify homologous sequences
Compute co-evolutionary statistics of mutations across the alignment
Train a deep neural network to predict protein contact maps directly from the raw co-evolutionary data
A systematic evaluation on the CASP11 dataset showed that this method significantly outperformed the best existing techniques of the time in both accuracy and speed
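To make the pipeline concrete, here is a minimal, hypothetical sketch of the last two steps: turning a multiple sequence alignment into pairwise co-evolution features and feeding them to a small convolutional network that outputs a contact map. This is not the authors' code; the feature construction, the network architecture, and names such as `coevolution_features` and `ContactCNN` are illustrative placeholders, and a real system would add steps such as sequence reweighting and average-product correction.

```python
# Illustrative sketch only -- not the 2016 paper's actual implementation.
# Assumes a precomputed MSA given as an (n_seqs, seq_len) array of integer-encoded
# amino acids (0..20, with 20 = gap). Features and network shape are placeholders.
import numpy as np
import torch
import torch.nn as nn

N_AA = 21  # 20 amino acids + gap symbol


def coevolution_features(msa: np.ndarray) -> np.ndarray:
    """Pairwise covariance of one-hot MSA columns -> (L, L, N_AA * N_AA) feature tensor."""
    n_seqs, length = msa.shape
    one_hot = np.eye(N_AA)[msa]                       # (n_seqs, L, 21)
    flat = one_hot.reshape(n_seqs, length * N_AA)     # (n_seqs, L * 21)
    cov = np.cov(flat, rowvar=False)                  # (L * 21, L * 21)
    cov = cov.reshape(length, N_AA, length, N_AA)
    return cov.transpose(0, 2, 1, 3).reshape(length, length, N_AA * N_AA)


class ContactCNN(nn.Module):
    """Tiny 2D CNN mapping co-evolution features to a symmetric contact-probability map."""

    def __init__(self, in_channels: int = N_AA * N_AA):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, L, L, C) -> contact logits of shape (batch, L, L)
        logits = self.net(feats.permute(0, 3, 1, 2)).squeeze(1)
        return (logits + logits.transpose(1, 2)) / 2  # contacts are symmetric
```

The property this line of work exploits is that co-evolution statistics already live on an L x L grid, so a 2D convolutional network can map them directly onto an L x L contact map, trained against contacts derived from known structures.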
This research brought together many pioneers in the fields of deep learning and protein prediction, including collaborators such as Thomas Brox, Alexey Dosovitskiy, and Jens Meiler.
Paper link: https://papers.nips.cc/paper_files/paper/2016/file/2cad8fa47bbef282badbb8de5374b894-Paper.pdf
Interestingly, at the end of the talk, Golkov predicted that "architecture optimization and scaling will further improve performance", which is exactly the direction the AlphaFold team's subsequent breakthroughs took.
As for why the work was never cited, there is still no definitive answer.
You can watch Golkov's 20-minute talk from that year for a fuller picture of how protein structure prediction developed.
In 2024, Demis Hassabis and John Jumper won the Nobel Prize in Chemistry for their contributions to protein structure prediction.
The Nobel Committee described the working principle of AlphaFold 2 roughly as follows:
Sequence alignment: the system searches databases for proteins similar to the input sequence, which may come from different species. By aligning them, the program reveals potential relationships between amino acid positions; for example, a mutation at one position may be correlated with a change at another.
Distance map generation: based on the correlation information in the alignment, the program generates a distance map showing how far apart the amino acids are in space.
Three-dimensional structure prediction: the program converts the distance map into a three-dimensional structure and ultimately predicts the shape of the protein with high accuracy (a toy illustration of this final step follows the schematic below).
Schematic diagram of the working principle of AlphaFold 2
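To build intuition for that last step, the toy snippet below recovers 3D coordinates from a complete, exact pairwise distance matrix using classical multidimensional scaling. This is only an illustration of why a good distance map essentially pins down a shape; it is not how AlphaFold works, since AlphaFold's distance predictions are noisy and probabilistic and are handled by learned structure modules.

```python
# Toy illustration (not AlphaFold's algorithm): classical multidimensional scaling
# recovers 3D coordinates from an exact distance matrix, up to rotation/reflection.
import numpy as np


def distances_to_coords(dist: np.ndarray, dim: int = 3) -> np.ndarray:
    """Embed points in `dim` dimensions so that pairwise distances match `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # Gram matrix of centered coords
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]               # keep the `dim` largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


# Round trip: coordinates recovered from their own distance map preserve all distances.
coords = np.random.default_rng(0).normal(size=(50, 3))
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
recovered = distances_to_coords(dist)
print(np.allclose(dist, np.linalg.norm(recovered[:, None] - recovered[None, :], axis=-1)))
# True
```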
Daniel Cremers believes that the core technical ideas of AlphaFold were already fully present in their 2016 paper.
He feels the Nobel Committee may have overlooked this foundational work.
In response, Hugo Penedones, a core member of the original AlphaFold 1 team, offered some historical details about the early days of AlphaFold.
Did DeepMind's Nobel-Winning Work Really Ignore Its Predecessors?
Hugo Penedones, a member of the initial AlphaFold 1 team, reconstructed DeepMind's development timeline.
From July 2015 to August 2019, Penedones worked at Google DeepMind on applied research in deep learning and reinforcement learning.
As he recalls, the AlphaFold 1 project started around March 2016, when, during an internal hackathon, the team tried applying deep reinforcement learning and optimization algorithms to the FoldIt game.
In the following months, they began to explore the possibility of contact map prediction.
Protein contact map of protein VPA0982 from Vibrio parahaemolyticus
Since the concept of contact maps already existed in the earlier literature, they realized that predicting contact maps with neural networks gave higher accuracy than predicting the entire protein structure directly.
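For readers unfamiliar with the term: a contact map is simply a binary L x L matrix marking which residue pairs sit close together in the folded structure; a common convention in the contact-prediction literature is a distance below roughly 8 Å between representative atoms of the two residues. A minimal sketch of that definition (the threshold and the choice of representative atom are conventions, not fixed rules):

```python
import numpy as np


def contact_map(coords: np.ndarray, threshold: float = 8.0) -> np.ndarray:
    """coords: (L, 3) array with one representative atom per residue -> boolean (L, L) map."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return dist < threshold
```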
Because of this, Penedones believes that DeepMind may well have arrived at the same idea independently in 2016.
Even so, DeepMind's papers were published well after the 2016 NeurIPS work, and the critics argue that this earlier research clearly should have been cited.
What Do AI Academic Giants Think?
In response to the controversy, Yann LeCun, one of the best-known figures in contemporary AI and a central figure at Meta's AI lab, also weighed in.
LeCun mentioned that the entire idea of using machine learning for bioinformatics research was born at the Snowbird Workshop in the 1990s (the predecessor of ICLR).
The participants included Anders Krogh (a professor at the University of Copenhagen), Pierre Baldi (a professor at the University of California, Irvine), Richard Durbin (a professor of genetics at the University of Cambridge), David Haussler (the scientific director of the Genomics Institute at the University of California, Santa Cruz), etc.
Before AlphaFold, there were already several research works on protein structure prediction using neural networks.
LeCun made clear that he did not mean to belittle "the contributions of AlphaFold".
Notably, the earliest work in this field was carried out by Pierre Baldi from the University of California, Irvine, one of the participants in the Snowbird Workshop in the 1990s.
He used recurrent networks to predict protein contact maps in 2000.
Paper address: https://pubmed.ncbi.nlm.nih.gov/11120677/
Paper address: https://pubmed.ncbi.nlm.nih.gov/10871264/
Paper address: https://pubmed.ncbi.nlm.nih.gov/10869034/
That was long before deep learning became popular.
LeCun's words are thought-provoking:
Good ideas rarely appear out of thin air. They spread and improve in some way, and sometimes it's even difficult to trace their origins.
Similarly, LeCun said, AlphaFold is an extraordinary and hugely influential achievement, but it is not an isolated contribution.
Pierre Baldi, the University of California, Irvine professor who was the first to work in this area, also shared his view.
Baldi said that the first application of deep learning to a form of protein structure prediction came in the 1980s.
That was the work of Qian and Sejnowski on the simpler problem of protein secondary structure prediction.
Paper address: https://pubmed.ncbi.nlm.nih.gov/3172241/
In that sense, deep learning methods for predicting contact maps and protein structures preceded AlphaFold by roughly twenty years.
Looking back, a careful review of the literature shows that deep learning methods for predicting contact maps also played an important role in the development of graph neural networks.
"Long before DeepMind, these methods were also used to learn how to play Go, and DeepMind has never admitted this," Baldi pointed out.
Pierre Baldi said bluntly, "In the long run, science is about truth and beauty. In the short term, it is a rather dirty human affair."