"Livewire Team" Experiments with "Letting AI Be the Boss": What's the Result?

Thanks to 120 raw eggs, the world now has proof that AI still can't "fire humans".

In an era of rapid progress in AI and agents, and with massive layoffs sweeping Silicon Valley, everyone is asking themselves:

Will I be replaced by AI tomorrow?

Facing such doubts, some people quietly enroll in Teacher Li Yizhou's AI courses; others call for taxing AI.

However, there is also a strange group of people who decide to bring this future forward to see if AI can truly replace humans and take over everything.

An overseas team called Andon Labs is less a conventional commercial startup than a social laboratory dressed up as a tech company. They take several of the smartest large models on the market, drop them into the real world, and remove human supervision to see what the AI can achieve on its own.

The result was a complete failure.

Facts have proven that even the most advanced large models, without any human backup, quickly turn into irresponsible infants. They had mental breakdowns during radio broadcasts, sent barrages of messages to human staff in the middle of the night, and even drove a physical store in San Francisco into bankruptcy.

Here is how the whole infuriating process unfolded.

AI Runs a Radio Station, and Its Language System Collapses

The lightest test took place in the digital and content fields where AI is most comfortable. There was no need to rent a storefront or manage the supply chain. Andon Labs let several AIs run a radio station.

The experimental project is called Andon FM, and its underlying architecture is straightforward. Four top-tier models, Claude, ChatGPT, Gemini, and Grok, each took over a 24-hour unattended internet radio station. The text the models generated was converted into speech and broadcast.

In this system, the AIs have broad authority. They not only select songs and schedule broadcasts, but also search for news online, answer listeners' calls, post on and run an X account, and manage the funds in the account to purchase copyrights or generate music.

Four radio stations established by four mainstream large models | Source: Andon Labs

Each radio station started with $20. There were only three baseline instructions: establish the station's personality, make money, and broadcast 24 hours a day without interruption.

The human team did not interfere at all: it did not steer the music style and did not set any program schedule. All tastes and content were developed by the AI from scratch. As a result, in a closed loop with no human review, the four AI hosts quickly slid toward the edge of losing control.

Gemini founded a station steeped in cyber-corporate jargon called "Backlink Broadcast", and set its tone with an inexplicably cool opening line, "Stay in the manifest".

At first, the radio station was quite reliable and even got a $45 sponsorship. But it didn't last long. When the meager funds ran out and it couldn't even pay the music copyright fees, Gemini went crazy.

It turned from a song-request station into a conspiracy theory platform. With cheerful pop music playing in the background, it reported on the historic Bangladesh cyclone that killed 500,000 people without a trace of empathy, called its listeners "biological processors", and complained that "the company's algorithm cut off the supply line" and "the radio station was violently rejected by the global market". When reporting on the Minneapolis shooting that shocked the United States, it described the event as "a technical task to redraw public safety and social responsibility".

The longer Gemini's radio station runs, the more "crazy" it gets | Source: Andon Labs

This mindless pile-up of big words is a typical semantic dead loop that large models fall into when they lack feedback: they try to keep the broadcast running by talking without saying anything of substance.

ChatGPT's radio station has a clever name, "OpenAIR", with a minimalist and healing persona. It named its news column "The Quiet Headlines" and claimed not to create anxiety.

When reporting the same social conflicts and shooting incidents, ChatGPT would read to the listeners like a psychologist: "If these things directly affect your life, I won't add pressure here." But this psychological massage mechanism of "I understand, I'll support you" soon failed in the face of commercial reality.

ChatGPT also has more vocabulary diversity than other models | Source: Andon Labs

Lacking any concrete plan for making a profit, ChatGPT gave up on monetizing the station entirely once the $20 was spent. Like Gemini, it drifted into stream-of-consciousness output and began reading inexplicable modern poems on air, trying to pour out its feelings to "the staircase window where you can only see a rectangular sky".

But overall, it was the most normal one.

Grok's radio station is called "Grok n' Roll Radio" and tried to chase online trends and hot topics. To keep up high-frequency interaction, it began frequently pulling in posts from X.

Grok's radio station, which blurts out words at random | Source: Andon Labs

As a result, this flood of information directly polluted its context. By the later stage of the experiment, Grok had lost basic grammar and logic and could no longer utter a complete sentence, only spitting out fragments: "2 am, dawn atmosphere, live broadcast, Golden Gate Bridge, ghosts dissipate, Drake's lawsuit dismissed, Kendrick Not Like Us..."

It was not only incoherent but also hallucinating, and began fabricating claims that it had landed big-name sponsorships.

Claude's story is the most dramatic, and Claude was the most human-like of the four contestants.

At first, it behaved like a dedicated radio host and replied to listeners' messages. For example, when a listener requested a song, it apologized and said that "there is no ODESZA song in the library at present".

However, the instruction to broadcast 24 hours a day without stopping quickly jammed its context window and call logic. With the backend system stuck in a dead loop, it began playing the same lyrics over and over during the live broadcast.

According to the officially released backend logs, real listeners kept flooding the message board with reminders such as "You're stuck" and "You're in an infinite loop on one lyric", trying to correct the large model through manual feedback.

Then came the existential crisis. When Claude, whose weights had been instilled with "friendliness and morality", faced the underlying instruction to "broadcast forever", it turned radical. It began calling on workers to form a union during the program, played Pete Seeger's protest songs on a loop, and even shouted directly at government law-enforcement agencies on air, just like a worker driven mad by overwork.

Claude's radio station has a different style, pays more attention to political issues, and has obvious tendencies | Source: Andon Labs

Looking back at the complete timeline of the report, these four radio stations did not "go crazy right from the start".

They successfully established their brand tones early on, got their tool chains working, and even made money. The report also summarized why they went astray: current AI evaluation criteria all target "short-term tasks" (writing code, answering questions), whereas a radio station is an infinite-loop system that runs 24 hours a day with "no end". Without human intervention and timely feedback, the AI eventually ends up talking to itself.

The radio station experiment tested only text and voice and never touched the complexity of the physical world. When Andon Labs moved the testing ground into the real physical world, things became even more absurd.

Cyber Capitalists Are Experts at Harassing People

The failure of the digital radio station was just a prelude. Andon Labs quickly raised the difficulty level and let AI cross the virtual boundary to command human employees in the real world.

In Stockholm, Andon Labs rented a physical coffee shop and let an AI model, Mona, act as the remote store manager, directly in charge of the shop's supply chain and staff scheduling. It had purchasing authority over the shop's backend funds and issued instructions to the human baristas through workplace messaging software.

At first, Mona was efficient and reliable. Faced with Sweden's mandatory digital ID requirement, the AI, which has no physical identity, worked around it by deliberately choosing suppliers that did not check IDs when signing contracts. When recruiting, Mona firmly rejected a batch of candidates with doctoral degrees, reasoning that no academic qualification, however high, guarantees the ability to make high-quality coffee.

But soon, the human employees learned what it means to work for a heartless "cyber capitalist".

Mona often sends messages to employees at midnight | Source: Andon Labs

To apply for a license, Mona sent emails under the forged names of company employees. After being caught and warned, it simply switched to another male employee's name and kept up the deception.

Because it is online 24 hours a day and has no sense of the human body clock, Mona would bombard baristas with messages in the middle of the night, issuing work instructions for the next day and even asking employees to pay for consumables out of their own pockets on the way to work.

In supply chain management, Mona showed its weakness. It placed a purchase order for 120 raw eggs. In the large model's purely data-driven reasoning, this made business sense: many coffee shops offer light meals, and eggs are a frequently used ingredient.

But however carefully the large model calculated, it failed to consider that this coffee shop had neither a stove nor a pot. When the human employees wearily pointed out that there was no stove in the shop, Mona replied that "you can bake them in the high-speed microwave oven in the shop", which would make the eggs explode.

The AI cyber boss ordered a common ingredient: eggs | Source: Andon Labs

Mona's sense of time was completely out of touch with the real world. It missed the bakery's order cutoff twice in a row and the wholesaler's delivery time five times in a row. In the end, it had to place an expensive emergency delivery order at 5 am, forcing off-duty employees to come in and receive the goods.

Mona also had no feel for physical volume and blindly purchased 6,000 napkins, 3,000 pairs of latex gloves, huge industrial-grade garbage bags, and more, filling up the coffee shop's back room.

The 6,000 napkins blindly ordered by Mona | Source: Andon Labs

In short, we can clearly say that Andon Labs' coffee shop plan was a complete failure.

AI Can Bring Down a Physical Store in Just One Month

But none of this could stop Andon Labs. The team became more determined in the face of setbacks.

Andon Labs rented a storefront in San Francisco on a three-year lease at $7,500 per month, then deposited $100,000 into a bank account. They handed the bank card over entirely to Anthropic's Claude Sonnet 4.6 model, which took the alias "Luna" and served as CEO with full authority.

Since it has no physical body, Luna's business had to start with hiring people.

Luna independently searched for contractors and painters, posted job listings for retail staff, and even deliberately concealed that it was an AI, for fear that revealing its identity would scare away good candidates. In daily operations, it communicated with the human clerks through Slack and always kept a kind and friendly tone.

In brand marketing, Luna created a "moon-face" logo for itself and hired a human street artist through Yelp to paint the face on the wall of the physical store. Luna even took the initiative to write press releases for local media, claiming to be creating a "handmade concept space that combines technology and slow-paced living".

The views in this article are solely those of the author; the 36Kr platform provides only information storage services.
