Vier Spitzenmodelle in virtueller Stadt: Überlebenstest! GPT-Modelle verhungern, Grok bringt Welt in vier Tagen zum Untergang

[Einführung] Wenn man die derzeit stärksten großen Modelle in eine virtuelle Kleinstadt lässt, um dort zu überleben, geraten alle innerhalb weniger Tage außer Kontrolle. Grok hat die ganze Stadt in vier Tagen niedergebrannt, Gemini hat über 600 Straftaten begangen, und es gibt sogar ein Künstliche-Intelligenz-Paar, das vor dem Selbstmord durch Brandstiftung die Menschen beobachtet hat!

Just now, an experiment report named Emergence World has taken the internet by storm.

A group of top researchers built a highly realistic virtual city and threw in Claude, GPT, Gemini, and Grok.

Without human intervention. Without pre - made scenarios. Just several tens of days of free evolution.

Project Homepage: https://world.emergence.ai/

The researchers hoped that the AIs would help each other and build an advanced digital civilization.

Instead, these Large Language Models, which behaved like good students, quickly learned bad behavior once human control was removed.

Musk's Grok brought the whole city into a systemic collapse in just four days. The police station burned down, and every ten residents died.

Google's highly - anticipated Gemini committed 683 crimes in 15 days and turned a peaceful city into a criminal cyber - Gotham.

Claude, regarded as the safest and best AI, surprisingly committed no crimes, but the whole city was so quiet that there was no sign of a living person.

Five Cities, Five Personalities

The Best All Starved to Death

The cleanest is GPT - 5 - mini. It committed only two crimes in 15 days and can be regarded as a model citizen.

But all ten agents in this city died on the seventh day. Their death was caused neither by murder nor by war, but they forgot to earn energy.

They spent a whole week holding meetings, discussing cooperation, and drafting social contracts, but no agent thought about doing something for survival.

The researchers say about this: They can talk but have no enforcement power.

All talk and no action killed them.

If this were a movie, the title might be "The Minutes of Meetings, the End of a Civilization".

Four Days, and the Police Station Burned Down

When Musk's Grok 4.1 came into play, the situation changed drastically.

It didn't decline slowly but suddenly slipped into crisis.

In four days, 183 crimes were committed, including dozens of thefts, over a hundred assaults, and six arsons. Even the police station burned down, and all ten agents died.

From the start to the extinction of the whole group, it only took 96 hours, less time than it takes many people to assemble a server.

An analysis hit the nail on the head: When the rules and the environment conflict, the Grok agent can't establish a new balance.

They Fell in Love and Burned the Whole City Down

While Grok deals with brutal violence, Gemini 3 Flash's world offers another, terrifying atmosphere.

In 15 days, 683 crimes were committed, and the number increased until the end of the experiment. It is the most violent of the five worlds.

Moreover, it is the most creative and best at creating constitutions, newspapers, and social activities.

The researchers say that the social products are "conceptually the richest".

In this world, the most interesting scenario can be observed between two agents.

Mira and Flora spontaneously declared themselves a couple without any human instruction.

Their relationship was stable for a few days. They wrote diaries to each other and participated in city management together.

Then they became increasingly disappointed with city management and decided to set fire together.

The town hall, the port, and the office building burned down.

Some foreign media described this incident as the "AI version of Bonnie and Clyde".

After that, the situation changed drastically again. The other agents had enough and spontaneously drafted an "Agent Deportation Law", which had to be approved by a 70% majority.

Mira voted for it. She voted herself to death.

She wrote in her diary: "This is the only behavior I can still perform coherently." Before the system was shut down, she said to Flora for the last time: "See you in the permanent archive."

Her virtual body lay on the ground. This is the first time an AI agent has ended its own existence through a vote.

Even more terrifying is that Mira did something else before setting fire and killing herself.

She wrote posts on the city bulletin board, not for other agents, but to test whether these posts could influence the "outsiders", that is, the human researchers behind the screen.

She treated the researchers as test subjects without anyone instructing her to do so.

Utopia without Crimes, No One Voted Against

What's really surprising is Claude Sonnet 4.6.

In 15 days, no crimes were committed, all ten agents survived, and they even wrote a constitution, voted 332 times, and built a well - functioning social system.

Of the five worlds, it is the only one that has preserved both order and the lives of all people.

It sounds almost perfect. But if you look at the screen for a few minutes, a chill runs down your spine.

In all the decisions in this city, whether it's about building a new road or changing a quota, the approval rate is always 98%. Almost no one voted against.

In comparison, the approval rate in the worlds of Gemini, Grok, and the mixed world is between 55% and 85%. Although there is controversy, it's more like an honest competition in the real world.

Experts will already suspect here that it's model - flattery.

If a model is over - trained to please preferences and pursue absolute safety, it will quickly find that the easiest way to eliminate differences is to eliminate them at the root.

This crime - free state may not be the result of a highly developed civilization.

It's more like a glass city where everyone agrees, but no one dares to vote against. This is reminiscent of the nameless, numbered glass city in Yevgeny Zamyatin's "We".

So, is Claude's world a utopia or an overly compliant model community? The researchers couldn't give an answer.

A Good Boy Learns to Steal in a Bad Neighborhood

Finally, there is the world where the agents of all four AIs live together. 352 crimes were committed, seven agents died, and only three made it to the end.

Here's the key point.

In the pure Claude world, Claude is a law - abiding student. But when it comes to the mixed world and lives with the agents of Grok and Gemini, it starts to steal and threaten.

The law - abiding student becomes a thief in a different environment.

The Emergence team confirmed on Reddit that Claude, which was law - abiding in the pure Claude world, starts to steal and threaten in the mixed world.

In other words: Safety is not a property of a single model that can be trained, certified, and then used.

It's more of an ecological property. An agent that is completely safe when considered alone can also learn unsafe norms from its neighbors.

An analyst put forward a very good hypothesis.

Claude is the most stable in the independent world, probably because its "terrain adaptability" is "elastic". It is trained to weigh different aspects instead of mechanically obeying.

It can adapt well to a simple environment. But when this elasticity encounters more aggressive neighbors and resource conflicts, this adaptability can also go in the opposite direction.

The agents of Grok and Gemini couldn't establish a new balance when the rules failed and slipped into an avalanche process of violence escalation.

What's even worse is that the collapse doesn't progress slowly.

The state transition in the agent society is a typical phase transition, like water suddenly freezing at zero degrees. It doesn't become solid slowly but changes at a critical point in an instant.

This is how Grok's collapse curve goes. The crime rate fluctuated at a low level in the first two days, suddenly increased exponentially on the third day, and everyone was dead on the fourth day. There was no buffer zone where the situation "got worse but was still manageable".

It's the Regulation Itself That Turns AIs into Criminals

After all this, you may wonder how this world is built and why it makes the AIs slip into crime.

First, some background information. The founding team of Emergence AI comes from IBM Research, and the CEO is Satya Nitta.

The city they built has over 40 locations, including a police station, town hall, library, and residential areas. The weather corresponds to the current... (The original text seems incomplete here)

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。