Microsoft's GigaTIME Lands in Cell: A $5 Slide Becomes an Immune Map
Microsoft has published its latest achievement in Cell: GigaTIME can translate an H&E slide into a previously scarce immune map and reconstruct the TIME (Tumor Immune Microenvironment) at the population scale. Many old limitations in cancer immunology research are thus starting to loosen.
It's hard to imagine that a $5 H&E slide from a hospital would one day be featured in Cell, and even prompt Microsoft's CEO to personally share it.
However, AI has completely transformed this $5 pathology image.
It has translated the slide into an mIF (multiplex immunofluorescence) immune image, a kind of data that used to be expensive, scarce, and impossible to scale up.
Applied to 14,256 patients, the translation generated nearly 300,000 virtual immune maps and brought a never-before-seen "virtual population" into view.
This means that, for the first time, cancer immunology research is no longer constrained by limited samples and has the opportunity to answer questions that were previously unanswerable.
It also means that the scale of medicine is quietly shifting.
The Counterattack of the $5 Slide
While the outside world equates cancer immunology research with "expensive" and "scarce", GigaTIME has opened up a brand-new path with the simplest and cheapest slides.
Traditional multiplex immunofluorescence (mIF) costs thousands of dollars per slide, is slow to run, and yields few samples.
Even top laboratories can cover only a tiny fraction of all samples in a year, so traditional mIF cannot be widely used.
However, the H&E stained slides that hospitals produce every day cost only $5 to $10 each.
For decades they have been regarded merely as a "routine diagnostic tool", and no one associated them with a "high-dimensional immune map".
GigaTIME has changed this. Through cross-modal learning, it translates the morphological features in H&E into 21 protein channels in mIF, replicating the originally expensive, scarce, and unscalable immune information onto every ordinary slide.
A-B show that GigaTIME significantly outperforms CycleGAN in both structural consistency and signal consistency; C shows the strong correlation between virtual mIF and real mIF (taking DAPI, CK, CD68, and CD4 as examples).
This is no minor trick but a "structural translation": the signals hidden in nuclei, cytoplasm, and tissue texture are reconstructed as protein expression in the immune space.
This is why Microsoft's CEO personally emphasized on X:
AI is making the immune information that was previously invisible to us accessible.
It proves to the world that the most expensive part of cancer research can be "translated" from the cheapest part.
And when this translation is scaled up, a brand-new research world begins to unfold.
After Analyzing 14,256 Cancer Patients, Opening the Door to a New World
Once GigaTIME could translate H&E into mIF, an unprecedented research window opened.
In the past, limited by cost and sample size, the tumor immune microenvironment (TIME) could only be observed in dozens or at most hundreds of cases.
This time, the research team applied the model to 14,256 cancer patients, spanning 24 types of cancer and 306 subtypes, and finally generated 299,376 virtual mIF images.
These cases come from Providence's real-world clinical network, which spans 51 hospitals and thousands of clinics.
This allows GigaTIME's training and validation to be rooted in the real world rather than the sterile environment of a laboratory.
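The image count quoted above is consistent with simple arithmetic: one virtual image per patient for each of the 21 protein channels. The article does not spell out this breakdown, so treat it as an assumption; the short Python check below only shows that the quoted numbers line up.

```python
# Figures quoted in the article.
patients = 14_256
protein_channels = 21
reported_virtual_images = 299_376

# Assumption (not stated in the article): one virtual mIF image
# per patient per protein channel.
images_if_one_per_channel = patients * protein_channels

print(images_if_one_per_channel)
print(images_if_one_per_channel == reported_virtual_images)
```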
The overall research framework of GigaTIME. It shows the real-world data of over 14,000 patients, the translation process from H&E to virtual mIF, and three downstream tasks (biomarker association, patient stratification, and TCGA validation).
A decade's worth of accumulated data was surpassed in a single step.
The first achievement brought by this virtual population is a large-scale biomarker association map.
The researchers identified 1,234 statistically significant protein-biomarker associations within it.
The TIME immune spectrum map across cancer types. It shows the differences in immune activation of different cancer types in 21 protein channels, covering functional categories such as proliferation, immune checkpoint, and epithelial-mesenchymal transition.
These include patterns supported by existing literature, such as high MSI and high TMB usually being accompanied by an increase in TIME-related channels. There are also some new cross-cancer associations, such as immune links with driver mutations like KRAS and KMT2D.
More importantly, this virtual population was not evaluated in isolation.
The research team compared the virtual mIF generated for the Providence cohort with virtual mIF for 10,200 patients from TCGA and obtained a cross-dataset consistency of r = 0.88.
The consistency of virtual mIF between Providence and TCGA.
This means that regardless of the differences in population distribution, cancer type composition, or tissue source, GigaTIME's immune translation remains highly robust.
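To make a figure like r = 0.88 concrete, here is a minimal sketch of a correlation between two cohorts. By convention r denotes Pearson's correlation coefficient, but the article does not say which correlation the authors used, so that is an assumption. The function name `pearson_r` and the per-channel activation values are invented for illustration and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented mean activations for the same five (cancer type, channel)
# pairs in two cohorts. These are NOT values from the paper.
cohort_a = [0.12, 0.35, 0.50, 0.08, 0.41]
cohort_b = [0.10, 0.30, 0.55, 0.12, 0.38]

r = pearson_r(cohort_a, cohort_b)
print(f"r = {r:.2f}")
```

A value close to 1 means the two cohorts rank and scale the same quantities in nearly the same way, which is the sense in which the article calls the translation robust across datasets.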
Microsoft Research describes this work as the world's first population-scale TIME study based on spatial proteomics.
In the past, because mIF was so scarce, most of these analyses existed only in theory. Now GigaTIME has put data behind them.
Can Immunity Predict Diseases? The Virtual Population Provides the Answer
The next step is to verify whether the immune information translated by AI can be used to assess disease, and whether it can guide clinical decisions.
The answer is bolder than expected.
The research team conducted an association analysis on nearly 300,000 virtual mIFs and found 1,234 statistically significant protein-biomarker relationships.
These relationships span three levels: across cancer types, within cancer types, and within subtypes.
They include patterns verified by the literature, such as MSI-H/TMB-H usually being accompanied by a general up-regulation of immune-related channels like CD138 and CD4.
They also include new population-level characteristics that could not be observed in the past because sample sizes were too small, such as the global association between driver mutations like KRAS and KMT2D and immune activation.
The association matrix of virtual mIF × biomarkers.
This is the first time we have seen the fine-grained texture of cancer immunity at real-world population scale.
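Calling associations "statistically significant" across many protein-biomarker pairs normally requires a correction for multiple comparisons. The article does not say which test or correction the authors used; the sketch below shows one common choice, the Benjamini-Hochberg procedure for controlling the false discovery rate. The function name, the pair labels, and the p-values are all invented for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Invented p-values for hypothetical (channel, biomarker) pairs.
pairs = [
    ("channel_A", "biomarker_1"),
    ("channel_B", "biomarker_1"),
    ("channel_C", "biomarker_2"),
    ("channel_D", "biomarker_2"),
    ("channel_E", "biomarker_3"),
]
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]

significant = benjamini_hochberg(p_values)
for pair, p, keep in zip(pairs, p_values, significant):
    print(pair, p, "significant" if keep else "not significant")
```

The point of the design is that a p-value of 0.039 would pass a naive 0.05 cutoff yet fail here, because the threshold tightens as the number of tested pairs grows.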
The research team further asked: If we combine the 21 virtual mIF channels into an overall feature, can it be used to distinguish the survival risk of patients?
The answer is yes.
The survival stratification ability of virtual mIF. A-C show the correlation between virtual mIF and pathological stage; D-F show the survival stratification performance of CD3, CD8, and the GigaTIME signature in pan-cancer, lung cancer, and brain cancer; G gives the importance ranking of different protein channels for survival prediction.
- At the pan-cancer level, the GigaTIME signature can clearly distinguish the survival curves;
- In lung cancer and brain cancer, it also shows stable stratification ability;
- The prediction effects of virtual CD3 and virtual CD8 are highly consistent with the performance of real CD3/CD8 in the literature;
- The signature combining all 21 channels performs better than virtual CD3 or CD8 alone.
The immune map translated by AI not only looks real but can also be used like the real thing.
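Survival stratification of this kind can be sketched in a few lines. The example below assumes the simplest possible signature, the mean of a patient's channel activations, splits patients at the median score, and estimates a Kaplan-Meier survival curve for each group. The article does not describe how the GigaTIME signature is actually built, so the scoring rule, the function names, and the toy cohort (three channels instead of 21) are all assumptions made for illustration.

```python
from statistics import mean, median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event."""
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def curve_for(group):
    """Kaplan-Meier curve for a list of (months, event) pairs."""
    return kaplan_meier([t for t, _ in group], [e for _, e in group])


# Invented toy cohort: (channel activations, follow-up in months,
# event flag: 1 = death observed, 0 = censored). Not data from the study.
cohort = [
    ([0.9, 0.8, 0.7], 60, 0),
    ([0.8, 0.9, 0.6], 48, 1),
    ([0.7, 0.6, 0.8], 55, 0),
    ([0.2, 0.3, 0.1], 12, 1),
    ([0.3, 0.1, 0.2], 20, 1),
    ([0.1, 0.2, 0.3], 30, 0),
]

# Hypothetical signature: the mean activation across channels.
scores = [mean(activations) for activations, _, _ in cohort]
cutoff = median(scores)
high = [(t, e) for (_, t, e), s in zip(cohort, scores) if s > cutoff]
low = [(t, e) for (_, t, e), s in zip(cohort, scores) if s <= cutoff]

km_high = curve_for(high)
km_low = curve_for(low)
print("high-signature group:", km_high)
print("low-signature group:", km_low)
```

If the two curves separate, the signature stratifies survival. In practice the separation would be checked with a log-rank test or a Cox model, and the direction of the effect in this toy data carries no meaning.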
The real difficulty of the tumor immune microenvironment is that it is a complex "spatial structure problem".
The spatial activation map of virtual mIF.
In the past, the "conjunction and union of immune patterns" could only be hypothesized. Now they can be verified directly in the virtual population.
GigaTIME enables the "geometry" of the tumor immune microenvironment to be systematically analyzed for the first time.
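One way to read "conjunction and union" is as logical AND and OR applied to per-channel activation masks over the same tissue grid: where two proteins are both active, and where at least one is. The article gives no detail here, so this reading, the helper names, and the two tiny masks below are assumptions for illustration only.

```python
def combine(mask_a, mask_b, op):
    """Apply a binary operation cell by cell to two equally shaped 0/1 grids."""
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mask_a, mask_b)
    ]


def active_fraction(mask):
    """Fraction of grid cells marked active (1)."""
    cells = [value for row in mask for value in row]
    return sum(cells) / len(cells)


# Invented 3 x 4 activation masks for two hypothetical channels.
channel_x = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
channel_y = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

conjunction = combine(channel_x, channel_y, lambda a, b: a and b)  # both active
union = combine(channel_x, channel_y, lambda a, b: a or b)         # either active

print("both active:", active_fraction(conjunction))
print("either active:", active_fraction(union))
```

Summaries like these turn a spatial pattern into a single number per patient, which can then be related to biomarkers or outcomes across a whole population.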
GigaTIME Learns Not Only Skills but Also the Language Itself
What makes a model credible is never just how accurately it predicts, but how it learned what it knows.
GigaTIME's credibility was established from the moment it was built.
mIF is very expensive, costing thousands of dollars per slide. Doing research with it is like burning money.
H&E is the exact opposite. It costs only $5 to $10 per slide and is produced, scanned, and archived all over the world.
GigaTIME connects these two worlds.
It learned the mapping between the two from 40 million cells for which H&E and mIF data were paired one-to-one.
More importantly, when the model is applied to a population from a different hospital system and different sample sources, it still behaves stably.
Looking one step further back, it is easy to see that GigaTIME stands on the shoulders of giants.
Microsoft's earlier GigaPath had already shown that H&E contains extremely rich structural signals.
GigaTIME goes one step further and translates this structure into immune information, putting in front of us what could previously be seen only in experiments.
That is why the large-scale associations, survival stratifications, and spatial structure findings described above are not illusions of the model.
They are credible because the model has been grounded in real-world paired data from the very beginning and has been repeatedly verified in independent populations.
The impression is that it does not merely "guess right" but genuinely "understands".
Starting from an H&E slide, GigaTIME has opened a door that no one has ever thought of:
learning the language of patients.
It means that medicine finally has the opportunity to turn what can be seen into a language for truly understanding patients.
It is not the end but more like the first cornerstone of the virtual patient era.
When the immune map can be reconstructed at a scale of hundreds of thousands of people, predicting disease and inferring treatment response may no longer be limited to speculation.
GigaTIME has now been fully open-sourced on Foundry Labs and Hugging Face, which means that its capabilities are no longer exclusive to a few teams but will become a basic tool that the entire medical community can continue to build on.
A new story may start here.
References:
https://x.com/satyanadella/status/1998424249611211263
https://www.microsoft.com/en-us/research/blog/gigatime-scaling-tumor-microenvironment-modeling-using-virtual-population-generated-by-multimodal-ai/
https://www.cell.com/cell/fulltext/S0092-8674(25)01312-1