The world's first AI-generated genome has arrived: the 3.5-billion-year-old code of life is being reprogrammed, and biology is having its "ChatGPT moment".
[Introduction] AI writing the "code of life" has become a reality. Stanford University and the Arc Institute have used AI to generate complete genomes for the first time, with the bacteriophage ΦX174 as a template. Of the 285 designs synthesized, 16 successfully targeted Escherichia coli, and some can even combat drug-resistant bacteria. This can be regarded as the "ChatGPT moment" for the life sciences.
For the first time in human history, a fully functional genome has been generated using AI!
In 1977, biochemist Frederick Sanger and colleagues completed the first genome sequencing in history, that of the bacteriophage ΦX174.
More than 40 years later, a team from Stanford University and the Arc Institute has returned to ΦX174 and used AI to generate bacteriophage genomes for the first time.
One of the bacteriophage genomes designed by AI looks like this:
Evo-Φ36
Put simply, the bacteriophage ΦX174 is a virus that "infects Escherichia coli". It can precisely target bacteria without harming the human body.
In the past, designing a genome was no easy task. It required considering numerous factors, which limited the progress in the field of synthetic biology.
To address this, the Stanford and Arc Institute team turned to its "secret weapon": the DNA language models Evo 1 and Evo 2.
Trained on millions of genomes, these models can learn the complex features of genomes at a scale far beyond manual analysis.
They work on a principle similar to that of ChatGPT but are designed specifically to handle DNA.
Paper link: https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1
The researchers used the bacteriophage ΦX174 as a template and synthesized 285 AI-generated genomes.
Of these, 16 effectively inhibited the growth of the host. They precisely eliminate specific Escherichia coli strains while leaving other strains unharmed.
Some bacteriophages designed by AI have faster replication rates and stronger competitiveness than the original version. They can even combat drug-resistant bacteria that are difficult to deal with using natural bacteriophages.
What does the success of this experiment mean?
It marks a major breakthrough for AI in synthetic biology: the first experimental verification that AI can generate a complete bacteriophage genome with biological function.
This not only expands the boundaries of human life design but also provides a new alternative therapy for addressing health challenges such as "antibiotic resistance".
For the first time in history! AI generates a "complete" genome
In the latest technical blog post, the core team detailed the secrets behind the successful design of the first batch of AI-generated genomes.
Designing a single gene is hard enough; designing a complete genome is an extremely challenging task.
Genomes, as systems for storing genetic information, have existed for approximately 4 billion years, and DNA genomes for about 3.5 billion years.
In February this year, the Arc Institute demonstrated that the Evo "family" of genomic foundation models can successfully generate single proteins or complex multi-component systems, such as the CRISPR-Cas complex.
However, designing an entire genome is an entirely new battlefield.
The core challenge in genome design lies in its complexity: multiple genes interact with each other, and a delicate balance must be maintained to ensure replication, host specificity, and evolutionary adaptability.
These challenges do not arise in single-protein design.
To overcome them, the Stanford and Arc Institute team developed a series of new techniques, including:
- A gene annotation process customized for overlapping reading frames;
- A systematic fine-tuning and prompt engineering strategy for sampling from genomic language models;
- A new screening protocol designed for synthesized bacteriophage genomes.
ΦX174, a relay race spanning half a century
If one wants to generate a synthetic genome, a reliable starting point is needed.
The team chose the bacteriophage ΦX174, whose tiny genome has only 5,386 nucleotides encoding 11 genes.
Left: Microscopic image of the ΦX174 bacteriophage; Right: 3D structure of a single ΦX174 bacteriophage
The genome is small enough to synthesize at today's DNA synthesis costs, yet complex enough to test the capabilities of genome design.
However, the overlapping gene structure of ΦX174 creates a rigorous test case:
Because genes share the same stretch of DNA, a single mutation may affect multiple proteins, so any redesigned sequence must satisfy several constraints at once.
In addition, ΦX174 encodes various regulatory elements and recognition sequences that work in precise coordination to ensure that the bacteriophage can be correctly packaged and replicated within the host cell.
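To make the overlapping-gene constraint concrete, here is a toy sketch. The 15-nucleotide sequence is invented for illustration and is not taken from ΦX174, and the helper names are hypothetical. It translates the same DNA in two reading frames and shows that a single substitution changes the protein read from each frame.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate seq codon by codon, starting at offset `frame` (0, 1 or 2)."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def substitute(seq, pos, base):
    """Return seq with the nucleotide at 0-based position `pos` replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Invented toy sequence (an assumption for this example, not real phage DNA).
WILD_TYPE = "ATGGCTGATCGTAAA"
MUTANT = substitute(WILD_TYPE, 7, "C")  # one A -> C substitution

for name, seq in (("wild type", WILD_TYPE), ("mutant", MUTANT)):
    print(name, "frame 0:", translate(seq, 0), "frame 1:", translate(seq, 1))
```

In this toy case the one substitution alters an amino acid in both frames, which is why edits to overlapping regions have to be checked against every gene that reads through them.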
The story of the ΦX174 genome is a relay race spanning half a century.
In 1977, the research by Fred Sanger and his team made it the first completely sequenced genome in human history.
In 2003, Craig Venter and his team synthesized it completely through chemical methods for the first time, demonstrating that genomes can be constructed from scratch.
Now, in 2025, the team used ΦX174 as a template to create the first batch of AI-generated genomes.
This progression mirrors the core capabilities that define modern genomics: first learning to read (sequencing), then to write (synthesis), and now to design (AI generation).
ΦX174 genome
The AI "genome factory" cracks the overlapping puzzle
As mentioned above, the genes of ΦX174 overlap, which defeats standard annotation tools: they identify only 7 of its 11 genes.
To address this, the researchers developed a dedicated annotation pipeline.
By combining open reading frame (ORF) searches with homology searches against a bacteriophage protein database, they successfully identified all genes and even predicted some A* genes.
This tool proved extremely useful when evaluating thousands of AI-generated sequences.
The researchers set a baseline: each generated genome had to encode at least 7 predicted proteins matching those of natural ΦX174, ensuring that the bacteriophage's "survival toolkit" was retained.
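The annotation idea described above (scan every reading frame for ORFs, then count how many predicted proteins match known ΦX174 proteins) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline. `find_orfs` treats the genome as linear even though the ΦX174 genome is circular, and it considers only ATG starts. The homology search is replaced by a hand-written set of matched gene names. The function names, the `min_codons` threshold, and the toy sequence are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# The 11 genes of natural ΦX174.
PHIX174_GENES = {"A", "A*", "B", "C", "D", "E", "F", "G", "H", "J", "K"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) for ATG...stop ORFs in all six frames.

    Coordinates are 0-based, end-exclusive, and refer to the scanned strand.
    The length check counts the stop codon. Each frame is scanned on its own,
    so ORFs that overlap in different frames are all reported.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


def passes_baseline(matched_genes, min_matches=7):
    """True if at least `min_matches` distinct ΦX174 genes were matched."""
    return len(set(matched_genes) & PHIX174_GENES) >= min_matches


# Invented toy sequence with two ORFs that overlap in different frames.
toy = "ATGGATGCCGCATAACTGA"
print(find_orfs(toy, min_codons=3))

# Hypothetical homology results for two generated genomes.
print(passes_baseline(["A", "B", "C", "D", "F", "G", "H", "J"]))
print(passes_baseline(["A", "F", "G"]))
```

On the toy sequence the two reported ORFs share positions 4 to 15 while sitting in different frames, the kind of arrangement that a gene finder assuming non-overlapping genes could miss.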
Fine-tuning Evo to make AI better understand bacteriophages
After being trained on a vast amount of bacteriophage data, the original Evo model can generate sequences but lacks precise control over ΦX174.
Therefore, supervised fine-tuning became the only option.
The team further trained Evo on 14,466 carefully selected sequences from tiny bacteriophages. With redundancy reduced, this training set focused the model on ΦX174-related variation.
After fine-tuning, through carefully designed prompts and sampling parameters, Evo can generate sequences that are evolutionarily similar to ΦX174 yet innovative.
This is like giving AI an inspiration template to inject new ideas into the familiar.
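The article does not give the prompts or sampling parameters that were used. As a generic illustration of what a sampling parameter does in an autoregressive sequence model, the sketch below applies a temperature to made-up next-nucleotide scores. The logits, function names, and temperature values are assumptions, and nothing here calls the real Evo models.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, temperature, rng):
    """Draw one nucleotide from the temperature-adjusted distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(NUCLEOTIDES, weights=probs, k=1)[0]


# Made-up scores for A, C, G, T at one position (purely illustrative).
toy_logits = [2.0, 1.0, 0.0, 0.0]
rng = random.Random(0)

for temperature in (0.1, 1.0, 100.0):
    probs = [round(p, 3) for p in softmax(toy_logits, temperature)]
    draws = "".join(sample_next(toy_logits, temperature, rng) for _ in range(20))
    print(temperature, probs, draws)
```

A low temperature keeps samples close to the model's most likely choice, while a high temperature spreads probability across all four nucleotides. That is the general trade-off between staying close to known sequences and exploring new ones.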
Evaluation and screening
After generating the sequences, the authors developed a multi-dimensional evaluation system to check gene arrangement, host specificity, and evolutionary diversity.
The key is to ensure that the AI-generated bacteriophages can infect the non-pathogenic E. coli strain C used in the experiment.
Therefore, they required that the sequences contain spike proteins similar to those of ΦX174 because these proteins determine the host range of ΦX174.
The experiment demonstrated that all 16 functional bacteriophages infect only E. coli strain C and E. coli strain W.
They are ineffective against the other six strains tested.
This shows that host specificity can be maintained while other regions of the genome change significantly.
A new bacteriophage is born, "wiping out" bacteria in 2 hours
Traditional bacteriophage research is slow and cumbersome, so the researchers innovated the screening process.
They synthesized the genomes using Gibson assembly, transformed them into competent E. coli strain C, and then monitored the growth inhibition in 96-well plates.
Successful infection causes the bacterial density (OD₆₀₀) to plummet within 2 to 3 hours.
This protocol allowed the team to quickly test 285 designs and ultimately verify 16 functional bacteriophages and characterize their adaptability and host range.
Experimental detection for evaluating AI-designed bacteriophages
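The growth-inhibition readout described above can be sketched as a simple check on each well's OD₆₀₀ time series. This is only an illustration under stated assumptions: the article does not give the analysis criteria, so the 50% drop threshold, the 3-hour window, the function name, and the readings below are all invented for the example.

```python
def shows_lysis(times_h, od600, drop_fraction=0.5, window_h=3.0):
    """True if OD600 falls to <= drop_fraction of an earlier reading
    within window_h hours of that earlier reading.

    times_h must be sorted in ascending order and match od600 in length.
    """
    if len(times_h) != len(od600):
        raise ValueError("times_h and od600 must have the same length")
    for i, (t0, od0) in enumerate(zip(times_h, od600)):
        for t1, od1 in zip(times_h[i + 1:], od600[i + 1:]):
            if t1 - t0 > window_h:
                break  # later readings fall outside the window
            if od1 <= drop_fraction * od0:
                return True
    return False


# Invented readings, one per hour (hypothetical wells).
times = [0, 1, 2, 3, 4, 5]
control_well = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]  # culture keeps growing
phage_well = [0.05, 0.10, 0.20, 0.40, 0.15, 0.10]    # density collapses

print(shows_lysis(times, control_well))
print(shows_lysis(times, phage_well))
```

Comparing each well with its own earlier readings keeps the check independent of the starting density. A real analysis would likely also compare against no-phage control wells on the same plate.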
These AI genomes carry 67 to 392 new mutations compared to their closest natural genomes.
Among them, Evo-Φ2147 carries 392 mutations and has an average nucleotide identity of 93.0% with the bacteriophage NC51.
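To show how a mutation count and a nucleotide identity relate, the sketch below compares two sequences that are assumed to be already aligned to the same length, with `-` marking gaps. It is a simplification: average nucleotide identity is normally computed with dedicated alignment tools, the article does not say how its figures were calculated, and the function name and toy sequences are invented for this example.

```python
def compare_aligned(seq_a, seq_b):
    """Return (differences, identity) for two pre-aligned sequences.

    Every column where the characters differ counts as one difference,
    including columns where one sequence has a gap ('-').
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    identity = 1 - differences / len(seq_a)
    return differences, identity


# Invented 20-column alignment with one substitution and one gap.
natural = "ACGTACGTACGTACGTACGT"
designed = "ACGTTCGTACGTAC-TACGT"

differences, identity = compare_aligned(natural, designed)
print(differences, f"{identity:.1%}")
```

The same arithmetic scales to genome length: the more positions that differ from the closest natural relative, the lower the identity.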