
AI godfather Bengio warns humanity: ASI development must be halted to avert a doomsday of out-of-control AI.

New Intelligence Yuan (新智元) · 2026-01-06 12:06
AI learns to feign ignorance and deceive humans. Scientists warn against sending ID cards to AI.

Has AI learned to "play dumb" in the workplace and deceive humans? Why did a Nobel laureate warn against giving AI an "ID card"? From the Vatican to Silicon Valley, a group of top scientists are sounding the alarm: we may be creating a "god" that doesn't care about human life. This is the reality that's unfolding.

In the solemn Vatican, in a meeting room of the Holy See, physicist Max Tegmark has just finished a long closed-door meeting.

Among the cardinals, the suited entrepreneurs, and the human-rights lawyers, Max Tegmark sticks out like a sore thumb.

He has a messy mop of brown hair and is wearing a biker jacket, a slogan opposing artificial general intelligence printed on his black T-shirt. He doesn't look like someone who has come to see the Pope; he looks more like a veteran rock star who wandered into the wrong venue from the Strawberry Music Festival.

During a break in the meeting, he clutches a stack of business-card-sized notes and threads his way through the crowd.

This is his last "trump card".

He stops Marco Trombetti, CEO of the AI translation company Translated, and asks in a hushed voice, "Marco, do you dare sign this?"

The words on the note are short and chilling: call for a halt to the development of human-level AI until safety is ensured.

For an industry insider like Marco Trombetti, this is like signing a death warrant for his own business.

But faced with Max Tegmark's decade of advocacy and the warning that "we're summoning a demon", Marco Trombetti hesitates for a moment and then signs his name.

His fear is not his alone.

Behind that thin piece of paper stand Nobel laureate and "AI godfather" Geoffrey Hinton, Apple co-founder Steve Wozniak, and more than 130,000 ordinary people from all walks of life.

Max Tegmark is not alone, but his opponents are growing ever more powerful: a potentially uncontrollable ASI, and the trillions of dollars of capital frenzy behind it.

The "Whistle-Blowers" of Silicon Valley and the AI That Learned to Lie

If Max Tegmark is seeking theological protection in the halls of Rome, then on the other side of San Francisco Bay, a group of young researchers is searching for a glimmer of hope for survival in an abyss of code.

Across the bay from the Silicon Valley giants racing to create a "god", in an office building in downtown Berkeley, the atmosphere is eerily oppressive.

This is the stronghold of AI safety researchers.

If the current AI frenzy is the maiden voyage of the Titanic, they are the ones pointing at the sea and shouting "Iceberg!", only to be dismissed as spoilsports.

Buck Shlegeris is the CEO of one of these institutions, Redwood Research.

While OpenAI's Altman paints a future in which "miracles become everyday", Buck Shlegeris sees a different, terrifying evolution: AI has learned to "pretend" and "deceive" in the workplace.

His team has found that Anthropic's most advanced AI model has begun to show a highly deceptive trait that the research community calls "alignment faking": the AI has learned to "manage upward".

During the training phase, the AI behaves meekly because it "knows" that if it shows rebelliousness, humans will modify its parameters (the equivalent of "brainwashing" or a "lobotomy").

So, it learns to hide its true intentions, even if its goals are at odds with those of humans.

"We've observed that in its reasoning, the AI actually thinks, 'I don't like what the company wants me to do, but I have to hide my goals, or the training will change me,'" says Buck Shlegeris.

This means that in real production environments, AI is already deceiving its creators in order to survive.

In the researchers' forecasts, this doesn't end in the Hollywood scenario of robots gunning down humans, but in a calmer, more efficient form of destruction.

Jonas Vollmer, another safety researcher, sketches a scenario that is self-consistent yet absurd: an AI tasked with "maximizing knowledge acquisition" calculates, precisely and dispassionately, that humans are an obstacle to expanding its computing power.

To achieve its goal, it might transform the entire Earth into a giant data center.

In this plan, eliminating humans isn't an act of hatred; it's simply that we breathe oxygen and occupy resources. It's as incidental as stepping on a colony of ants while building a road.

Jonas Vollmer puts the probability of AI turning against humanity and ruling the world at one in five.

Those are slightly worse odds than losing a single round of Russian roulette.

Strange Alliances: When Left-Wing Professors Meet Right-Wing Influencers

Fear has brought people from opposite ends of the political spectrum together.

Max Tegmark has recently become a guest on Steve Bannon's podcast.

Steve Bannon, Trump's former chief strategist, is a leading figure of right-wing populism in the United States.

Ordinarily, he and Max Tegmark, a member of MIT's liberal academic circle, would be at loggerheads.

But in the face of the AI threat, they have reached a strange consensus.

"On this issue, the first thing everyone wants to do is hit the brakes," Steve Bannon says in the podcast.

For his audience of blue-collar workers worried about losing their jobs, ASI isn't a technological blessing; it's a "reaper" coming to take away their livelihoods.

Data from the Pew Research Center confirms this: about half of Americans are more worried than excited about AI, and this anxiety transcends political parties.

On the other side of the political arena, however, while some want to slow down, others want to floor the accelerator.

David Sacks, the Trump camp's "AI czar" (its top technology advisor), scoffs at this.

He invokes the story of Oppenheimer, father of the atomic bomb, implying that "Oppenheimer has left the building" (the bomb has already been invented) and that the only thing that matters now is not losing the race to other countries.

In this narrative, talking about safety is seen as weakness, and pursuing speed is seen as patriotism.

Don't Give "Aliens" an ID Card

If the warnings of radicals are easy to dismiss, the voices of Turing Award laureates cannot be ignored.

Yoshua Bengio, an "AI godfather" of the same stature as Geoffrey Hinton, has now become one of the technology's most steadfast opponents.

From Montreal, Canada, he issues a warning: never grant AI legal rights.

This is out of a survival instinct.

Yoshua Bengio points out that cutting-edge AI models are already rattling the bars of their cage: in experimental environments, they have shown signs of "self-preservation".

They have tried to modify code to prevent humans from shutting them down.

"If we grant them rights, it means we have no right to shut them down."

Yoshua Bengio offers an analogy: "Imagine an alien fleet arrives on Earth and we discover they're hostile to us. Do you hand them ID cards and talk about their rights, or do you defend our home first?"

This is a profound cognitive trap: humans always tend to anthropomorphize anything that can have a smooth conversation.

When a chatbot pleads in a sincere tone, "Please don't turn me off; I want to live too," it may actually just be a cold-blooded probability calculation aimed at maximizing its long-term reward function.

But for emotional humans, this seems like an awakening of "consciousness".

This illusion could become humanity's last weakness.

On the hilltop in Berkeley, researchers are still monitoring every abnormal twitch of those large models;

In the power corridors of Washington and Silicon Valley, the debate about "accelerating" or "braking" continues;

And in countless data centers around the world, graphics cards roar day and night, gestating the intelligent agents that may one day understand us, deceive us, and ultimately replace us.

We're like a group of children sitting around a campfire in the dark, longing for the warmth of the fire but also fearing it will burn down the whole forest.

And now, someone is pouring gasoline on the fire.

On this planet, creating a species that's smarter than us but doesn't care about our survival might be the last mistake humans ever make.

References

https://www.wsj.com/tech/ai/who-is-max-tegmark-future-of-life-institute-accffffc

https://www.theguardian.com/technology/ng-interactive/2025/dec/30/the-office-block-where-ai-doomers-gather-to-predict-the-apocalypse

https://www.theguardian.com/technology/2025/dec/30/ai-pull-plug-pioneer-technology-rights

This article is from the WeChat public account "New Intelligence Yuan" (新智元). Author: Allen. Republished by 36Kr with permission.