AI "Decodes" Ancient Rome, Revealing the Truth of Millennium-Old Inscriptions: DeepMind's New Model Appears in Nature Again

DeepMind has launched Aeneas AI to assist archaeologists in restoring and deciphering ancient inscriptions.

The song "Love Before the Common Era" has the line: "When an ancient civilization leaves behind only an indecipherable language, legends become immortal poems." Now, with Aeneas, the generative AI tool launched by DeepMind, archaeologists are no longer at a loss when facing ancient inscriptions.

Aeneas takes its name from a wandering hero of Greco-Roman mythology.

Aeneas, published in the main Nature journal on July 24, is a multimodal generative neural network that helps historians interpret, attribute, and restore fragmentary texts.

Imagine that archaeologists have discovered an inscription engraved with ancient characters in Europe. The text is incomplete, and some characters have been weathered or deliberately damaged.

With no contextual information either, it is almost impossible to restore, date, or locate the origin of such an inscription, especially without similar inscriptions to compare it against.

In the Roman world, writing was everywhere: everything from imperial monuments to everyday objects was inscribed. The texts range from political graffiti, love poems, and epitaphs to commercial transactions, birthday invitations, and magic spells.

Figure 1 A bronze military diploma from Sardinia, dated 113/14 AD and restored by Aeneas, issued by Emperor Trajan to a sailor on a warship

These inscriptions provide modern historians with rich insights, revealing the diversity of daily life in the Roman world.

But their sheer number also makes archaeological work harder. Archaeologists must draw on their expertise and search the databases they have built up to find parallels: texts that are similar in wording, syntax, standardized formulas, or provenance.

But isn't retrieving similar texts and establishing the context of a passage exactly what a generative model is good at?

This is where Aeneas comes in. It can reason across thousands of Latin inscriptions and retrieve texts with similar contexts in just a few seconds, freeing archaeologists from the complex and time-consuming task of text retrieval.

Now they can quickly obtain interpretations of ancient inscriptions and conduct further research based on the model's findings.

Figure 2 The user interface of Aeneas

The Rich Functions of Aeneas

Before the emergence of Aeneas, in 2022, DeepMind launched Ithaca, a tool based on a deep neural network that predicts the age of ancient Greek inscriptions and fills in missing texts.

Aeneas goes a step further. By supplying context, it helps historians interpret texts and give meaning to isolated fragments, so that they can draw richer conclusions and build a better understanding of ancient history.

Specifically, it searches for parallel texts in a vast collection of Latin inscriptions. By transforming each text into a kind of historical fingerprint, Aeneas can identify the deep connections between texts.

For dating and attribution, Aeneas can place a text within 13 years of the date range provided by historians, and it can assign an inscription to one of 62 ancient Roman provinces with an accuracy of 72%.
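To make "within 13 years of the date range" concrete, here is a minimal sketch of one way such a dating error could be measured. It assumes the error is zero when the predicted year falls inside the historians' range and is otherwise the distance to the nearest bound; the function names and sample values are hypothetical and are not taken from the Aeneas paper or code.

```python
def years_outside_range(predicted_year, earliest, latest):
    """Distance in years from a predicted date to a scholarly date range.

    Assumption: a prediction inside [earliest, latest] counts as zero error.
    """
    if predicted_year < earliest:
        return earliest - predicted_year
    if predicted_year > latest:
        return predicted_year - latest
    return 0


def mean_dating_error(samples):
    """Average the per-inscription errors over (predicted, earliest, latest) tuples."""
    errors = [years_outside_range(p, lo, hi) for p, lo, hi in samples]
    return sum(errors) / len(errors)


# Invented examples (years AD), for illustration only.
samples = [
    (120, 100, 150),  # inside the range: error 0
    (165, 100, 150),  # 15 years after the latest bound
    (40, 50, 75),     # 10 years before the earliest bound
]
mean_error = mean_dating_error(samples)
print(round(mean_error, 2))  # 8.33
```

Under this reading, a reported figure of 13 years would be the typical gap between the model's date and the nearest edge of the range scholars accept.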

As the first model to use multimodal input to determine the geographical origin of a text, it can analyze both text and visual information, such as images of inscriptions.

Unlike Ithaca, Aeneas can restore passages even when the length of the missing text is unknown.

Aeneas can restore damaged inscriptions with up to ten missing characters with an accuracy of 73%. When the length of the restoration is unknown, the accuracy is still 58%.

This makes it a more versatile tool for historians dealing with severely damaged materials.

Aeneas is not only suitable for inscriptions but can also adapt to other ancient languages, scripts, and media, from papyrus to coins, expanding its functions to help connect a wider range of historical evidence.

Those who want to try Aeneas can use it interactively at predictingthepast.com.

Because Aeneas is open source, Chinese archaeologists could also adapt it to interpret inscriptions in extinct scripts found in China, such as Tangut (Western Xia) and Khitan.

Working Principle and Typical Cases

To train Aeneas, DeepMind researchers curated a large and reliable dataset, drawing on decades of work by historians. It includes texts and images of inscriptions from the ancient Greek and Roman eras.

Aeneas uses a Transformer, the architecture widely used in NLP, as a decoder to process the inscription text, and it retrieves similar inscriptions ranked by relevance.

For each inscription, Aeneas' contextualization mechanism retrieves a list of similar items using a technique called embedding: the text and contextual information of each inscription are encoded into a historical fingerprint that captures its content, language, time and place of origin, and relationship to other inscriptions.
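The retrieval idea can be sketched in a few lines. This is only an illustration under stated assumptions: the three-dimensional vectors, the inscription names, and the use of cosine similarity are hypothetical stand-ins. The article does not specify how Aeneas compares its fingerprints, and real embeddings would be produced by the trained model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_parallels(query, corpus, top_k=3):
    """Return the top_k corpus entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical "historical fingerprints" (invented values, not model output).
corpus = {
    "epitaph_A": [0.9, 0.1, 0.0],
    "military_diploma_B": [0.1, 0.9, 0.2],
    "dedication_C": [0.0, 0.2, 0.9],
}
query = [0.2, 0.8, 0.1]  # fingerprint of a newly found fragment

ranking = rank_parallels(query, corpus, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Here the fragment's closest parallel is `military_diploma_B`, followed by `epitaph_A`. A historian would then read the retrieved parallels rather than rely on the similarity score alone.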

Figure 3 The architecture of Aeneas, showing how the model receives text and image inputs to generate predictions of provinces, dates, and restorations

Next, let's look at a typical example of Aeneas parsing an ancient text.

The "Res Gestae Divi Augusti" is one of the most famous inscriptions in Roman history: a first-person account, written by Emperor Augustus himself, that summarizes and glorifies his lifetime achievements.

The text contains exaggerated descriptions of the empire, irrelevant dates, and false geographical markers, and historians have long debated when it was written.

Aeneas approaches the question through a contextual analysis of the inscription's ambiguous dating and provenance features.

It captures clues in spelling and vocabulary, as well as linguistic nuances that indicate subtle political ideologies and imperial affiliations.

Its prediction is based on the subtle language features and historical markers mentioned in the text, such as official titles and monuments.

In doing so, Aeneas turns the dating problem into a probability estimate based on linguistic and contextual data. Interestingly, it does not predict a single fixed date but produces a detailed distribution of possible dates, as shown in Figure 4.

Its prediction shows two obvious peaks. A smaller peak appears around 10-1 BC, and a larger, more confident peak falls between AD 10 and 20.
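A two-peaked prediction of this kind can be illustrated with a small sketch that treats the output as a probability histogram over date bins and picks out its local maxima. The bin labels and probabilities below are invented so that they mirror the shape described in the text; they are not Aeneas' actual output, and `find_peaks` is a hypothetical helper.

```python
def find_peaks(labels, probs):
    """Return (label, probability) for every bin higher than both neighbours, largest first."""
    peaks = []
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else 0.0
        right = probs[i + 1] if i < len(probs) - 1 else 0.0
        if p > left and p > right:
            peaks.append((labels[i], p))
    return sorted(peaks, key=lambda item: item[1], reverse=True)


# Invented distribution over date bins (illustrative values only).
labels = ["30-21 BC", "20-11 BC", "10-1 BC", "AD 1-9", "AD 10-20", "AD 21-30", "AD 31-40"]
probs = [0.03, 0.07, 0.20, 0.10, 0.45, 0.12, 0.03]

peaks = find_peaks(labels, probs)
print(peaks)  # [('AD 10-20', 0.45), ('10-1 BC', 0.2)]
```

Reporting both peaks with their weights, rather than a single best guess, is what lets the output line up with two competing scholarly positions.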

These results indicate that Aeneas' prediction is cautious, reflecting the differences in the opinions of current scholars.

By providing two possible date ranges instead of a single prediction, Aeneas offers a new, quantitative approach to historical debate.

Figure 4 A histogram of Aeneas' prediction of the dating of the "Res Gestae Divi Augusti". The model simulates the academic debate surrounding the dating of this famous inscription

Recently, there have been many attempts to apply AI technology to the field of archaeology, from facial reconstruction of unknown veterans to the creation of digital avatars of the ancients in museums. The application of AI in the fields of archaeology and history deserves attention.

Last year, Fudan University even offered a course on "AI Archaeology". The Deep Learning and Visual Computing Laboratory (SCUT-DLVCLab) of South China University of Technology also launched the Tonggu large model, which focuses on processing classical Chinese in ancient books.

Given the vast number of ancient books and steles in China, future archaeologists may have an even greater need for tools like Aeneas to extract valuable information from massive data.

References

https://deepmind.google/discover/blog/aeneas-transforms-how-historians-connect-the-past/

https://www.nature.com/articles/d41586-025-02335-x

https://blog.google/technology/google-deepmind/aeneas/

This article is from the WeChat public account "New Intelligence Yuan". The authors are Peter Dong and Ying Zhi. It is published by 36Kr with authorization.

