
A farewell letter from the father of the Transformer: After 8 years, the world needs a new AI architecture.

新智元 2025-10-27 11:01
The next Transformer-class architecture is in the making.

Has the "father" of Transformer "defected"? The man who sparked the AI revolution 8 years ago now finds his "own creation" too noisy and competitive! While capital is surging and papers are piling up, he shouts: It's time to abandon Transformer and rediscover curiosity.

The "biological father" of Transformer has "run away", saying he's fed up with his "own child"!

Remember the paper "Attention Is All You Need" that put the "attention mechanism" on the throne 8 years ago?

Co-author Llion Jones publicly "defected" at the TED AI conference in San Francisco recently:

AI research is becoming increasingly narrow, and he himself is going to switch his passion for Transformer to "low-power mode".

The large amount of capital and talent has pushed the research circle into a dead end. Everyone is only focused on competing for larger parameters and rushing to publish papers, and no one dares to explore new architectures.

More money, fewer ideas?

This is a wonderful chemical reaction between capital and paper-related KPIs.

Jones' stance is straightforward: After unprecedented attention, capital, and talent poured in, the research has been "narrowed down".

Why?

On one hand, investors are eyeing the returns. On the other hand, researchers are worried about being "scooped" by others. Everyone is desperately trying to make their mark in the crowded field.

So, what's the result?

Rushed academic achievements, incremental innovation, and identical paper titles.

He also brought up an old friend from textbooks: "exploration vs. exploitation".

The current industry has turned the "exploitation" knob to level 11: constantly patching the same architecture, changing its appearance, enlarging the model, and adding a sprinkle of "we're SOTA again".

But no one dares or has the time to explore truly new paths.

Jones said at the conference: Everyone will lose their jobs in the future, and that's a good thing.

A historical review: "Manual polishing" in the RNN era

Jones reminded everyone to think about the days before the emergence of Transformer!

At that time, the research circle was making endless minor improvements to RNN, just like polishing a stone.

Once Transformer made its debut, all those polishing efforts immediately became like "putting a carbon-fiber tail fin on a carriage"!

It was delicate work but completely off-track, and it ended up useless. Who still talks about RNNs now?

So Jones is worried that we might be repeating history: continuously milking an 8-year-old architecture until it runs dry!

We keep creating all sorts of fancy equipment for the carriage but don't look around to see if there's a spaceship waiting at the crossroads.

How did Transformer "grow up"?

It's not about KPIs, it's about freedom!

This is the answer given by the father of Transformer.

The most heart-wrenching part comes from Jones' memory:

Back then, when he was working on Transformer at Google, it was a bottom-up process of "chatting over lunch and doodling on the whiteboard".

No one stipulated how many papers had to be published, and no one was pushing for a specific indicator.

First comes freedom, then comes inspiration. It sounds simple, but it has become a scarce commodity today.

Now, even with a seven-figure annual salary, many people may not dare to "experiment recklessly".

On the first day of a new job, who doesn't want to stabilize their performance first?

So low-risk, publishable, and quick-yielding projects naturally take top priority.

Imagination? Creativity?

Let's wait a bit.

Sakana AI's "anti-involution" experiment

Llion Jones plans to turn the exploration knob back.

Jones later went to Japan to found Sakana AI.

As the CTO of the Tokyo-based startup Sakana AI, Jones said he plans to recreate the "atmospheric formula" before the birth of Transformer in the lab:

Fewer KPIs, more curiosity; less following the trend, more natural inspiration.

He also recommended a research motto to the team:

You should only do the research that wouldn't happen if you weren't doing it.

(From engineer Brian Cheung)

An example of a result born in this environment is Sakana's "Continuous Thought Machine", which integrates a brain-like synchronization mechanism into the neural network.

The employee who came up with the idea told Jones that at his previous employers or in academic positions, he would have faced doubts and pressure not to waste time.

At Sakana, Jones gave him a week to explore.

The project was successful enough to be presented at NeurIPS, a major AI conference.

Jones even said that in recruitment, freedom trumps compensation.

When talking about this exploratory environment, he said: "This is a very, very good way to attract talent. Think about it, talented, smart, and ambitious people will naturally seek out this kind of environment."

This move proves that freedom is more attractive than high salaries.

Smart people are often more sensitive to freedom than to money.

"It's not a breakup, it's a cooling-off period": Don't take him as an opponent

Perhaps the most ironic thing is that Transformer may be a victim of its own success.

The current technology is so powerful and flexible... that it prevents us from looking for better technologies. It makes sense that if the current technology were worse, more people would be looking for better ones.

Jones doesn't want to throw Transformer overboard.

He emphasized: There is still a lot of important work to be done on the existing technology, and it will continue to create value in the next few years.

Given the current density of talent and resources in the industry, we can definitely "afford" more exploration.

The power of Transformer is blocking our impulse to find something "better".

If the existing technology were a bit worse, people would be more likely to look for the next surprise everywhere.

Change the "arms race" to "unboxing and sharing"

At the end, Jones took an open-minded stance: This is not a "winner-takes-all" arena, it's a collective puzzle-solving process.

If everyone can turn up the exploration knob a bit and publicly share interesting findings, the path to the next "Transformer-level" breakthrough may be much closer than we think.

It's still unknown whether the AI power-holders (OpenAI, Google, or others) will heed this call.

But Jones issued a sharp reminder: The next Transformer-level breakthrough may be just around the corner.

After all, he has worked in the Transformer field longer than almost anyone else.

He'll know when it's time to turn to a new direction.

The Eight Sons of Transformer

Transformer has laid the foundation for today's AI era. Almost all foundation models are built on it.

The simple "token by token" output mode, boosted by computing power, has become the AI magic of the new era.

Transformer has given rise to many cutting-edge products such as ChatGPT, Gemini, and Claude.

More importantly, it has truly led humanity into the era of generative AI.

The fates of humans and generative AI began to intersect at 17:57 on Monday, June 12, 2017.

The influence of Transformer continues!

As of today, the paper has been cited over 180,000 times!

It makes people wonder where the other co-authors, besides Jones, are now.

The "biological fathers" who once jointly created Google's most powerful Transformer have now gone their separate ways.

· Ashish Vaswani

Co-founder & CEO of Essential AI

He said he hopes to make Essential AI "the Western DeepSeek" (interview on June 17, 2025).

· Noam Shazeer

Has returned to Google; co-leader of Gemini technology

· Niki Parmar

Technical staff at Anthropic.

Previously co-founded Essential AI with Vaswani and was an early co-founder of Adept.

Joined Anthropic around the end of 2024 or the beginning of 2025.

· Jakob Uszkoreit

Co-founder & CEO of Inceptive Nucleics

He took the stage at TED AI San Francisco in 2025 to share new ideas on "how AI can bypass traditional science"; he continues to advance in the direction of "biological software".

· Llion Jones

Co-founder & CTO of Sakana AI

Foreign media reported that Sakana AI is in talks for a new round of financing, with