
RSI, the idea of AI that builds itself, has become a hit; Google is pouring cold water on it, while DeepSeek and other companies are edging toward it

雷科技 2026-06-07 07:17
Will AI self-construct through recursion?

The word "recursion" has suddenly caught on in AI circles.

Two startups have taken the term as their company name outright. Many labs have begun putting a three-letter abbreviation, RSI, on their roadmaps; it stands for recursive self-improvement. Like AGI, RSI is becoming industry shorthand that makes people both excited and nervous, even though no one fully agrees on what it means.

(Image source: X)

What is RSI? Put simply, it means letting AI train itself. In the technical community, RSI has long been regarded as one of the main markers of progress in artificial intelligence, on par with memory, reasoning, and multimodality. In the idealized version, the only limit is computing power; humans are no longer required, not even as helpers.

It sounds like science fiction, or perhaps just dangerous. But step back for a moment: this isn't the AI industry's first frenzy. From AlphaGo in 2016 to ChatGPT in 2023, and now the parameter arms race among large models, the industry's nature is to chase the next "game-changer". In the view of Lei Keji AGI (ID: leikejiagi), RSI might be the next big thing.

RSI is on fire: When AI can self-construct through "recursion"

In May this year, the well-known AI researcher Richard Socher founded a new company called Recursive Superintelligence, a name that itself shortens to RSI.

He said: "Our core goal is to build a truly recursive self-improving superintelligence. The entire process of conceiving, implementing, and verifying the research will be fully automated."

Another case that insiders love to talk about is the Auto-Research project promoted by Andrej Karpathy, which uses a cluster of agents to train language models, letting the models carry out simple research tasks and improve themselves.

Image source: GitHub

Andrej Karpathy is himself something of a legend, having made significant contributions to Tesla's Autopilot and OpenAI's GPT. He has now made RSI his next stop and is pursuing it openly and transparently, which suggests he genuinely believes it can be achieved.

He is also surprisingly candid about the project. He regularly posts progress updates on Twitter and has opened a public GitHub repository for the code. Karpathy himself has said that the current work is still iterating on a small model at the GPT-2 level, "not breakthrough research (for now)", but that has been enough to draw a large number of researchers into following along.
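To make the "propose, test, keep what works" idea concrete, here is a minimal toy sketch. It is not Karpathy's code and makes no claim about how his repository works: the `evaluate` function is a hypothetical stand-in for "train a small model and measure its validation loss", and the single tuned number `lr` is an assumption chosen only to keep the example tiny.

```python
import random


def evaluate(lr):
    """Hypothetical stand-in for 'train a small model, return validation loss'.

    The toy loss is lowest at lr = 0.3; a real run would train and score a model.
    """
    return (lr - 0.3) ** 2


def auto_research(steps=50, seed=0):
    """Toy loop: propose a change, test it, keep it only if the loss improves."""
    rng = random.Random(seed)
    best_lr = 1.0
    best_loss = evaluate(best_lr)
    log = [best_loss]
    for _ in range(steps):
        candidate = best_lr + rng.uniform(-0.1, 0.1)  # "conceive" a small change
        loss = evaluate(candidate)                     # "implement and verify" it
        if loss < best_loss:                           # keep only improvements
            best_lr, best_loss = candidate, loss
        log.append(best_loss)
    return best_lr, log


if __name__ == "__main__":
    lr, log = auto_research()
    print(f"start loss {log[0]:.4f}, final loss {log[-1]:.4f}, lr {lr:.3f}")
```

No human appears inside the loop; the only gatekeeper is the `evaluate` score, so the result is only as trustworthy as that measurement.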

More importantly, Andrej Karpathy recently joined Anthropic's pre-training team. Anthropic has Claude, and Karpathy has the Auto-Research methodology. Combine a large model with a self-training loop, and if it works, the result will no longer be a small-scale effort at the GPT-2 level.

Image source: haimagazine

Another company, Adaption, has launched a tool called AutoScientist that aims to automate the training process of cutting-edge models. The logic is the same as Andrej Karpathy's Auto-Research: train agents to make incremental improvements. Adaption's ambitions are greater, however; it wants to close the training loop for a full-scale cutting-edge model directly.

The two represent different approaches. Andrej Karpathy is verifying the idea step by step from the bottom up, open-sourcing the work and building momentum in the community; Adaption is targeting commercial large-model training directly, with a stronger drive to deploy. Which path succeeds first will shape the industry in very different ways.

Google CEO pours cold water: We're not there yet

On RSI, the big names in AI are divided.

Google CEO Sundar Pichai gave a cautious assessment on a podcast last month: "(RSI) is a continuum, and we are indeed all making progress. But if we go by the way people describe RSI, it represents the next level of acceleration, which will have many impacts, but we're not there yet."

Yet that "continuum" already contains plenty to make people shudder.

In January this year, an engineer leading the development of Claude Code at Anthropic said that nearly 100% of the team's code was written by Claude Code. This is, quite literally, AI writing itself: not AI assisting engineers with code, but an AI tool that has, to some extent, replaced engineers in writing its own code.

Image source: Anthropic

Anthropic ran an internal survey on the preview version of Mythos: of 18 engineers, 5 thought that, with further improvements to the supporting systems, this version of Mythos could replace an L4 engineer, that is, a mid-level programmer who can independently take on complex projects without real-time supervision.

But the flaws are also clearly stated: "The main weaknesses reported by Claude include: handling vague tasks beyond the management cycle, understanding organizational priorities, taste, verification, instruction-following, and epistemology." In other words, its weak spots are precisely the self-directed abilities, and self-direction is the foundation of RSI.

Interestingly, the Center for Security and Emerging Technology (CSET) at Georgetown convened a group of experts last year specifically to study RSI. Their assessments were clearly divided: some expected an imminent "superintelligence explosion", while others expected slower progress that would eventually hit a bottleneck.

But they agreed on one point: recursion makes the future extremely difficult to predict.

For this reason, an article by Ajeya Cotra, a researcher at METR, breaks down the RSI process into several milestones. I think this is the best analysis framework currently available.

The first level is called "adequacy": With humans completely removed, the system can still conduct research. It may not be as good as humans, but it functions.

The second level is called "parity": The research independently completed by AI is of comparable quality to that independently completed by humans.

The third level is called "supremacy": The performance of an independent AI system exceeds that of a system where humans and AI collaborate.

It's a bit like levels L2 through L5 in autonomous driving. Ajeya Cotra's judgment is that we are very close to the first level. She gave no timeline for when the second level will arrive, but her deduction is very clear: once the second level is reached, the subsequent acceleration will far exceed anything seen so far, and "it might reach the third level within a year."

Why so fast? Because at the second level, AI becomes a research team that doesn't need to sleep, hold meetings, or align on KPIs. It can test, modify, and test again around the clock. Human researchers, by contrast, get only a few hours of effective deep work a day even at their most efficient, with countless interruptions and communication costs in between. Once this bottleneck is removed, progress will accelerate dramatically.

No one in China is shouting RSI, but DeepSeek and others are on the verge

After all this overseas progress, you might ask: What about China?

To be honest, Chinese AI companies rarely mention RSI in public. It is almost unthinkable in China for an AI company to write "recursive superintelligence" into its corporate mission. But when it comes to letting AI improve itself, they have quietly touched the edges of it in different ways.

The most typical example is DeepSeek. It spends an order of magnitude less money than OpenAI, yet competes head-on in many reasoning tasks. It achieves this through extreme optimization of algorithmic efficiency: the MoE architecture, aggressive compression of activated parameters, and careful engineering of training strategies.

Although this has little to do with RSI directly, it replaces brute-force computing power with smarter methods. And that happens to be one of the core ideas behind RSI: letting the model find a smarter path as it iterates.

At Baidu's Wenxin, using reinforcement learning to drive model self-optimization is already common practice. The name RSI is not used, but the goal is similar: letting the model keep improving through self-feedback loops on specific tasks. Seen this way, it is not that Chinese companies aren't doing RSI; they have turned parts of it into everyday engineering practice without using the name.

(Image source: generated by Gemini)

Of course, the gap is real. No Chinese company currently matches the talent density of OpenAI and Anthropic, which means that in exploring RSI, we are still playing catch-up for now.

But history suggests that Chinese companies often catch up at remarkable speed once the path has been proven. Overseas experts are dissecting the RSI framework ever more clearly, and Karpathy's code is publicly available on GitHub. Once a reproducible path succeeds, the cost control and the density of deployment scenarios that Chinese players bring will be a variable the market has seriously underestimated.

At the same time, some cold water is in order. Data that AI generates to train its own successor tends to decline in quality. The logic of RSI is that AI generates good data, which is then used to train the next-generation AI, making it stronger.

The reality may be the opposite. AI-generated data often carries the model's own hallucinations and biases, and its quality degrades. When this second-hand data is fed to the next version, that version produces even worse third-hand data. After several generations of this cycle, the whole system can collapse, like a copier making copies of copies: by the tenth copy, the image is a blur.

The academic community calls this model collapse, and there are already papers verifying that this phenomenon actually exists.
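The copier analogy can be illustrated with a deliberately tiny simulation. This is a toy sketch under stated assumptions, not the experimental setup of any of the papers mentioned: the "model" is just a Gaussian with a mean and a standard deviation, each generation is fitted only to a small sample drawn from the previous generation, and `simulate_collapse`, `sample_size`, and `generations` are names and values made up for this example.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for the real data distribution
    spread = [sigma]
    for _ in range(generations):
        # The current "model" produces synthetic data...
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ...and the next "model" is fitted to that synthetic data only.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        spread.append(sigma)
    return spread


if __name__ == "__main__":
    spread = simulate_collapse()
    for gen in (0, 10, 50, 100, 200):
        print(f"generation {gen:3d}: std = {spread[gen]:.6g}")
```

In this toy setting the spread of the data tends to shrink from generation to generation, because each refit sees only a finite sample and rare values in the tails are gradually lost. Real language models are far more complex, so this shows only the flavor of the mechanism, not its magnitude.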

Moreover, the ideal environment RSI requires simply doesn't exist in the real world. For such a system to run, two prerequisites are indispensable: effectively unlimited computing power and a globally open, collaborative research ecosystem.

The reality is that the cost of training a cutting-edge model has reached the billions. Chip production capacity is limited, energy is limited, and high-quality data is running short. Export controls and technological decoupling are splitting AI research into isolated circles, with neither people nor goods able to move between them. Without even these basic conditions, talk of RSI is premature.

RSI is no longer just a technical problem; it also requires a sufficiently open world, and the technology circle really can't decide whether this prerequisite can be met.

Conclusion

Finally, I'd like to share an observation: Over the past five years, the industry first drew people into "parameter worship" through large-scale pre-training, then persuaded them that "values can be fine-tuned" through RLHF (Reinforcement Learning from Human Feedback), and now RSI is telling a story of "machines running the entire R&D chain by themselves". Each step pushes humans one step back, not out of the industry, but out of the decision-making chain.

This retreat is not necessarily a bad thing, but it is irreversible. Once a link in the chain is taken over by automation, people's intuition, experience, and judgment in that link gradually atrophy, just as your sense of direction deteriorates once you come to rely on GPS.

By then, we may not even really understand how the tools are made.