
The female lead of "Resident Evil" has crossed over into AI and achieved SOTA, creating a free "AI memory system" that has gone viral on GitHub.

QbitAI (量子位) · 2026-04-09 15:27
Note that it runs entirely locally.

I'm so shocked! I found a project related to the heroine of "Resident Evil" on GitHub!

Milla Jovovich, the well-known actress who played Alice, recently teamed up with a longtime programmer friend and Claude to build an AI memory system.

After months of refinement, the system debuted with an astonishing 96.6% on the long-term memory benchmark LongMemEval, billed as "the highest publicly verifiable score in history."

Moreover, it was open-sourced on release, and anyone can use it for free (note that it can run locally).

Of course, more interesting than the results is the design concept behind the project:

As its codename MemPalace suggests, the project draws on the "memory palace" method used by ancient Greek orators, letting the AI organize memories by "spatial location":

The Palace is a large room containing all knowledge. Different knowledge is placed in different Rooms according to themes, and each room stores specific Memories.

Retrieving something is like walking through a room and opening doors one by one.

Thanks to this memory structure, MemPalace's retrieval efficiency is roughly 34% higher than global random search.

More importantly, past attempts at AI memory focused on "letting the AI decide what is most worth remembering"; now that is no longer necessary.

With MemPalace, every conversation you have with AI can be remembered, even those spanning several months (Is this another form of "Black Mirror" coming true?).

Even more interesting, they created a purpose-built shorthand language for the AI, called AAAK.

Using the "Memory Palace" Increases Retrieval Efficiency by 34%

Let's look at more information about MemPalace.

According to co-author Ben Sigman, MemPalace differs from existing AI memory systems in two ways:

First, it gets the best results; second, it works in a completely different way.

In addition to its record score on LongMemEval (RAW mode), it also scored 92.9% on ConvoMem (which focuses on short-term memory) and 100% on LoCoMo (which focuses on ultra-long-term memory spanning months).

Some of the evaluation results presented by Ben are as follows (anyone can conduct tests using the scripts provided in the repository):

It should be noted that MemPalace does not send user data to the cloud. All memory processing is completed locally.

This also fundamentally reduces the risk of privacy leakage — because neither the recording and structured organization of conversation content nor subsequent retrieval and invocation rely on remote servers.

Beyond running locally, MemPalace also adopts a human-like memory model: the "memory palace" method.

Unlike common vector-database solutions, MemPalace does not simply slice conversations into chunks, embed them, and run similarity recall.

Its core lies in transforming memories into a navigable spatial structure.

At the top of the hierarchy, a Wing represents a person or a project; each wing is an independent space.

Within each wing are many Rooms. Each Room represents a specific topic (authentication, billing, deployment, etc.), and all information is sorted into rooms by topic.

Halls connect the rooms, defining "which category a memory belongs to" (suggestions, personal preferences, decisions, etc.) and adding attributes beyond the topic.

The specific content is stored in two layers:

Drawers store the original records, with all conversations intact and completely sealed;

Closets store compressed summaries of this content, prepared for AI to read quickly.

When the same rooms appear in different wings, MemPalace will automatically create Tunnels to connect the same themes scattered among different people and different projects.

Ultimately, under this structure, any piece of content can be looked up by its path through the memory palace.
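The repository's actual schema is not shown in the article, but the hierarchy it describes can be sketched as a small data model. All class and field names below are assumptions for illustration, not MemPalace's real code:

```python
from dataclasses import dataclass, field

@dataclass
class Drawer:
    """Raw records: conversations kept intact and untouched."""
    records: list[str] = field(default_factory=list)

@dataclass
class Closet:
    """Compressed summaries, prepared for the AI to read quickly."""
    summaries: list[str] = field(default_factory=list)

@dataclass
class Room:
    """One topic, e.g. authentication, billing, deployment."""
    topic: str
    halls: set[str] = field(default_factory=set)  # memory categories: decision, preference, ...
    drawer: Drawer = field(default_factory=Drawer)
    closet: Closet = field(default_factory=Closet)

@dataclass
class Wing:
    """One person or project: an independent space of rooms."""
    name: str
    rooms: dict[str, Room] = field(default_factory=dict)

@dataclass
class Palace:
    wings: dict[str, Wing] = field(default_factory=dict)
    # Tunnels link same-topic rooms across wings: (wing_a, wing_b, topic)
    tunnels: set[tuple[str, str, str]] = field(default_factory=set)

    def add_room(self, wing: str, topic: str) -> Room:
        w = self.wings.setdefault(wing, Wing(wing))
        room = w.rooms.setdefault(topic, Room(topic))
        # Auto-create tunnels when the same topic appears in another wing.
        for other in self.wings.values():
            if other.name != wing and topic in other.rooms:
                self.tunnels.add(tuple(sorted((wing, other.name))) + (topic,))
        return room
```

In this sketch, a memory's "path" is simply wing → room → drawer/closet, which is what makes path-based lookup possible.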

Here, the author conducted two experiments to verify two things:

First: Why does this structure increase retrieval efficiency by 34%?

Second: How does the memory stack work? (After all, not all memories need to be fully loaded every time).

Regarding the effectiveness of the structure, the author directly compared the effects of four retrieval methods in more than 22,000 real conversations:

global random search; restricting to a single wing; adding a hall filter on top; and finally targeting a specific room.

The results showed that each added layer of structure shrinks the search space and tightens the semantic constraints, so retrieval improves at every step.
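That narrowing effect can be shown with a toy sketch: each structural filter (wing, hall, room) cuts the candidate pool before any relevance scoring runs. The records and the word-overlap scorer below are made up for illustration; the article does not show MemPalace's real retrieval code:

```python
# Hypothetical memory records tagged with wing / hall / room.
memories = [
    {"wing": "myapp", "hall": "decision", "room": "auth", "text": "chose JWT over sessions"},
    {"wing": "myapp", "hall": "decision", "room": "billing", "text": "Stripe for invoicing"},
    {"wing": "client-x", "hall": "preference", "room": "auth", "text": "prefers SSO"},
]

def search(query_terms, wing=None, hall=None, room=None):
    # Each structural filter shrinks the candidate pool before scoring,
    # which is why deeper paths are both faster and more precise.
    pool = [m for m in memories
            if (wing is None or m["wing"] == wing)
            and (hall is None or m["hall"] == hall)
            and (room is None or m["room"] == room)]
    # Toy relevance score: count of query terms appearing in the text.
    scored = [(sum(t in m["text"] for t in query_terms), m) for m in pool]
    return [m for score, m in sorted(scored, key=lambda s: -s[0]) if score > 0]
```

With no filters, every record must be scored; with wing and room fixed, only a handful are, which is the intuition behind the reported 34% gain.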

In the author's view, the palace structure itself is the product.

As for efficiency questions such as when to search and how much to retrieve, the author designed a memory stack with importance tiers ranging from light to heavy.

Among them, L0 + L1 are the "permanent residents": about 170 tokens in total, always loaded in every conversation so the AI has basic self-awareness and user context from the moment it "wakes up".

L2 and L3 are triggered on demand: the former is room-level recall, the latter a global deep search.

The overall logic is to understand you with the minimum cost first; if it's not enough, supplement locally; if still not enough, conduct a global search.

The benefits of this design are straightforward: each individual recall is more accurate, and long-term memory is more stable.
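The tiered loading logic might be sketched as follows. The ~170-token figure comes from the article; the function and depth parameter are an assumed illustration, not MemPalace's actual code:

```python
def load_context(query: str, need_depth: int) -> list[str]:
    """Escalating memory load: understand cheaply first, then go deeper.

    need_depth 1 = core only, 2 = add room-level recall, 3 = add global search.
    """
    # L0 + L1: always loaded, giving baseline self-awareness and user context.
    context = ["L0+L1 core identity & user profile (~170 tokens)"]
    if need_depth >= 2:
        # L2: on-demand, room-level recall scoped to the query's topic.
        context.append(f"L2 room-level recall for: {query}")
    if need_depth >= 3:
        # L3: on-demand, global deep search across the whole palace.
        context.append(f"L3 global deep search for: {query}")
    return context
```

The design choice mirrors the article's stated logic: minimum cost first, local supplement second, global search only as a last resort.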

When it comes to long - term memory, the author directly fed MemPalace with six months' worth of conversation content, approximately 19.5 million tokens.

A quick conversion: that is roughly 200–400 books, or a medium-sized codebase spanning 10–30 projects.

With such a large volume, traditional methods are hopeless (it's basically impossible to fit it all into the context).

With summary compression, that can typically be squeezed down to 650,000 tokens at an annual cost of about $507 (roughly 3,500+ RMB), but information is often lost along the way.

With MemPalace, only about 170 tokens are normally loaded, plus 13,500 tokens on demand. The annual cost drops to $10, with no loss of detail.
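A quick back-of-envelope check of those figures, under the assumption that cost scales linearly with tokens loaded (the article does not state its pricing model):

```python
# Figures from the article.
summary_tokens = 650_000      # tokens loaded under summary compression
summary_cost = 507.0          # stated annual cost in USD
palace_tokens = 170 + 13_500  # always-on core + on-demand load for MemPalace

# If cost is proportional to tokens loaded, the stated numbers should agree:
implied_palace_cost = summary_cost * palace_tokens / summary_tokens
# ≈ $10.7/year, close to the article's ~$10 figure.
```

So the two cost claims are roughly self-consistent: a ~48x reduction in tokens maps to a ~50x reduction in cost.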

In this sense, MemPalace is indeed on a different level from traditional memory systems.

The author also revealed two keys to MemPalace's accuracy.

One is the AI-specific language "AAAK" mentioned at the beginning: mainstream large models can understand it directly, with no extra decoder.

Since large models essentially read streams of tokens, the less redundancy and the more concentrated the key information, the more clearly they understand it.

Especially when expressing a large number of repeated entities, AAAK can significantly compress tokens.

That said, the author also notes that AAAK mode reaches 84.2% recall on LongMemEval versus 96.6% for RAW mode, a gap of more than 12 percentage points.

So if accuracy matters most, stick with RAW mode; if you want to save tokens and can accept some information loss, use AAAK.
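AAAK itself is not publicly specified in the article, but the core idea it describes, replacing repeated entities with short codes plus a legend, can be sketched like this (the code scheme and function are made up for illustration):

```python
def compress(text: str, entities: list[str]) -> str:
    """Replace repeated long entity names with short codes, preceded by a legend.

    Illustrates the token-saving principle only; the real AAAK format is unknown.
    """
    legend = {name: f"@{i}" for i, name in enumerate(entities)}
    body = text
    for name, code in legend.items():
        body = body.replace(name, code)
    header = " ".join(f"{code}={name}" for name, code in legend.items())
    return header + "\n" + body
```

Each repeat of a long entity then costs one short token instead of several, which is exactly where the article says AAAK's compression pays off, and where lossy recall can creep in.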

The other is real-time error correction, implemented via a standalone tool, fact_checker.py (integration in progress).

Once it ships, MemPalace will automatically run a consistency check before generating results whenever earlier and later information conflict.

For example, when you input "Soren completed the identity verification migration", MemPalace outputs: "Identity verification migration: task assignment conflict; the assignee is Maya, not Soren."
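Since fact_checker.py is not yet released, the following is only a guess at the kind of check the article describes, with made-up data and field names:

```python
# Hypothetical store of previously recorded facts, keyed by (subject, field).
known_facts = {("auth migration", "assignee"): "Maya"}

def check(subject: str, field: str, claimed_value: str):
    """Compare a new claim against stored facts; report a conflict if they differ."""
    stored = known_facts.get((subject, field))
    if stored is not None and stored != claimed_value:
        return f"{subject}: {field} conflict - stored {stored}, not {claimed_value}"
    return None  # no stored fact, or the claim is consistent
```

The key property is that the check runs against already-stored memories before a response is generated, rather than trusting the newest statement.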

When MemPalace can balance both long - term memory and accuracy, it's not hard to understand why the author dares to claim it as "the current best memory system".

By the way, its logo also looks like a simplified palace.

How to Install and Deploy?

The specific installation steps and two usage modes are as follows.

Step 1: Run the following command in the terminal to install MemPalace via pip.

  • pip install mempalace

Step 2: Initialize and create your own memory palace (the command below sets up a main directory that stores all your memory data).

  • mempalace init ~/projects/myapp

Step 3: Mine data. Feed your project files, chat records, etc., to MemPalace to let it build an index.

There are three modes to choose from according to the data type:

(1) Mine projects: Suitable for code, documents, and notes.

  • mempalace mine ~/projects/myapp

(2) Mine conversations: Suitable for chat records exported from Claude, ChatGPT, Slack, etc.

  • mempalace mine ~/chats/ --mode convos

(3) General mining: Automatically classify content as decisions, milestones, problems, etc.

  • mempalace mine ~/chats/ --mode convos --extract general

After completing these three steps, your local memory palace is set up, and all data is stored locally without being uploaded to the cloud.

Next is how to use it.

One is the automatic mode, suitable for AIs that support MCP tool calls. Only one connection is required:

  • claude mcp add mempalace -- python -m mempalace.mcp_server

After the connection is completed, the AI will automatically call MemPalace's retrieval tool.

The other is the manual enhancement mode, mainly used in conjunction with local models.

For everyday use, load the basic memory with the following command (the wake-up trigger, about 170 tokens):

  • mempalace wake-up > context.txt

If you have more needs, you can retrieve relevant memories on demand through the command line and manually inject the results into the prompt.

  • mempalace search "auth decisions" > results.txt

If you don't want to do it manually, you can also use the Python API to complete retrieval and injection directly in the code.

  • from mempalace.searcher import search_memories
  • results = search_memories("auth decisions", palace_path="~/.mempalace/palace")

Oh my god! I saw a celebrity on GitHub!

Finally, let's briefly introduce the MemPalace team: an architect + a programmer + Claude.

The architect is someone we all know: the actress Milla Jovovich, who has contributed many classic screen roles.

In addition to Alice in the "Resident Evil" series, she played the alien Leeloo in the famous science-fiction film "The Fifth Element" and Milady de Winter in the action film "The Three Musketeers"...

Wait a minute. Are you saying that such a well-known actress is now attached to an AI memory system?

All we can say is that although it's a huge contrast, the fact is