
Breaking news: Legendary researcher Karpathy joins Anthropic. Everything you want to know is here.

AI Disagreement (AI唱反调) 2026-05-20 08:12
Is pre-training approaching its "grand finale"?

"I've joined Anthropic."

Andrej Karpathy posted this tweet. Coming from a researcher of his stature, it lands much like Michael Jordan's famous "I'm back."

He wrote: "The next few years will be especially formative at the forefront of LLM. I'm very excited to join the team here and get back to R&D."

This is a top researcher's clear bet on the next three to five years of his own career. And the position he bet on is exactly the one the entire AI circle has been most reluctant to bet on over the past year: pre-training.

Who is Karpathy?

Karpathy's legend rests on a dream résumé: he has been present at every key juncture of AI development.

In 2015, Karpathy obtained his PhD from Stanford and joined OpenAI the same year. He was the youngest among the 11 founding members.

As a core founding member of OpenAI, he was deeply involved in building the underlying architecture of the first-generation large models. He was an early pioneer of the GPT line of technology, laying the foundation for the development of generative AI. After joining Tesla, he led the autonomous driving AI team and built the pure-vision Autopilot system from scratch, bringing intelligent driving to large-scale deployment and setting the global benchmark for the vision-only approach to autonomous driving.

After leaving Tesla, he returned to OpenAI for a second stint to focus on large model training technology. Later, he started his own company to explore AI education. The Vibe Coding paradigm and CLAUDE.md development specifications he created have become popular worldwide, winning hundreds of thousands of GitHub stars and becoming a general standard for developers. He is recognized as a top technical expert and one of the industry's great educators.

That is to say, he left OpenAI twice. Once he went to Tesla, and once he started his own company. This time he didn't return to OpenAI - he chose Anthropic.

Anthropic's recruitment story has so far followed a single line: the departure of the safety faction from OpenAI. Dario, Tom Brown, Jared Kaplan, and Sam McCandlish are all from this line. Due to differences in values, they left collectively and established their own company. This narrative has been going on for five years, and even Anthropic itself is a bit tired of it.

Karpathy stands entirely outside this line. He didn't leave because of safety differences. Among OpenAI's founding members he is the totem of the "engineering ability faction": nanoGPT, llm.c, CS231n, Tesla AI Day. For the past decade his name has been practically synonymous with the "top researcher who works out of love." This time he didn't return to OpenAI, didn't go to SSI, didn't back Mira Murati's Thinking Machines, and didn't go to xAI. In the end, he chose Anthropic.

This means that the claim that Anthropic is "the best place to do cutting-edge LLM R&D now" carries the stamp of a top researcher who is neutral in the safety narrative and judges purely from a research perspective.

The impact of this stamp is greater than that of any previous Anthropic hire. It targets not OpenAI's talent depth but OpenAI's brand as a "research company": if one of your own founders thinks our place is better suited to research, how can you still claim to be a research company?

As for Eureka and the education work he cares about, he left one line at the end of his tweet: "plan to resume my work on it in time." He will come back to it when the time is right. In other words, now is not that time.

He Chose Technology, Not Capital

The real significance of this move is that he was placed on the pre-training team.

Not the agent work everyone has been chasing for the past year, but pre-training: the position that determines the data recipe, the scaling laws, training stability, and the upper limit of the Claude model's intrinsic ability.

Instead of choosing the hottest agent track, Anthropic aims to further improve the capabilities of the foundation model.

Over the past year and a half, the capital narrative across the AI circle has been shifting towards post-training. "Pre-training is hitting a wall" - this line started spreading in the corridors of NeurIPS in the second half of 2024, became the cover story of The Information in early 2025, and then Ilya Sutskever himself said in a public interview that "the era of pre-training as we know it is ending." The circle has reached a rough consensus: the marginal returns from scaling are diminishing, and the next wave of breakthroughs depends on RL post-training, inference-time compute, and composite agent workflows.

Karpathy's choice of pre-training at this moment amounts to staking his reputation as a researcher on the opposite direction. He didn't write a long blog post to argue that pre-training still has room to grow. He stated his position directly through his career choice, and given his credibility in the circle, that position is more convincing than any paper.

To understand the value of this, we need to know that Karpathy is not an ordinary ML researcher. He is one of the few who can integrate the "dirty work of large-scale training engineering" and the "intuition of the first principles of LLM."

Projects like nanoGPT, which condense GPT training into a few hundred lines of Python, require judgment on the trade-offs at each layer of the training stack. Writing the entire LLM training in C in llm.c is a job that strips away all abstractions. He led an engineering team of dozens of people at Tesla and built an industrial-grade end-to-end training pipeline. There are no more than ten people in the entire industry who "understand both the details and the system, can write papers and control costs."

Anthropic placed him under Nick Joseph instead of giving him a VP title, which is also a very practical gesture - it shows that Karpathy himself is also willing to start from execution.

In the short term, this means there won't be immediate results. The pre-training cycle is long, and it will take him at least half a year to settle into the new environment. He will only truly enter the training pipeline of the next-generation Claude in 2027.

But in the medium and long term, this is a huge chip that Anthropic has placed on the bet that "there is still a lot of room for pre-training."

Deeper Impact on Anthropic

The talent war is just the surface.

The real significance of this personnel change for Anthropic is to push its research culture narrative to a position that OpenAI can't catch up with in the short term.

Over the past year, OpenAI's narrative has been completely taken over by products. ChatGPT has reached 900 million weekly active users, Sora 2 has launched, and the Agent SDK and the GPT-5 series have shipped. These are all excellent commercial achievements, but the cost is that the outside world no longer readily sees OpenAI as a "research company." Its research story has been overshadowed by the product story, and even internal researchers are complaining about this.

Anthropic has chosen a different path. It has no blockbuster consumer product and no breakout launch like Sora. Awareness of Claude among ordinary users has always been an order of magnitude lower than that of ChatGPT. But it has held on to one thing: "We are a science company."

Dario talks much more about mechanistic interpretability and the training philosophy of Claude than about revenue in public interviews.

The biggest weakness of this narrative is that no top researcher from outside Anthropic has endorsed it. You can't have Dario keep making the case by himself; it would look like self-promotion.

Karpathy fills this gap.

In how both the Chinese- and English-speaking worlds imagine "research culture," he is the greatest common denominator. nanoGPT is open source, CS231n is free, and his YouTube courses are open to everyone. That is why he embodies the "top researcher who works out of love." No internal Anthropic employee can substitute for this brand asset, because it took ten years to build.

Over the next year, whenever Anthropic raises funds, recruits, or pitches to enterprise customers, "Karpathy is on our pre-training team" will come up again and again. He brings not only a researcher's output but also a recalibration of the narrative about "what kind of company Anthropic is."

In the Chinese-speaking community, this amplification effect will be even stronger. Anthropic's position in the Chinese market has always been awkward: its products are not available, yet it is widely read in developer culture. Karpathy's arrival will tilt the Chinese developer community's attention further towards Anthropic, and this is exactly the pool that OpenAI has long ignored.

Of course, it's really hard for us to expect Anthropic to favor the Chinese community.

OpenAI's "Founder Curse"

Karpathy's move will bring back to the surface a question that OpenAI doesn't want to discuss: how many of OpenAI's 11 founding members are still at the company?

Let's count: Ilya Sutskever went to Safe Superintelligence, Mira Murati (not a founding member, but an early core member) led the establishment of Thinking Machines, John Schulman went to Anthropic and then jumped to Thinking Machines, Wojciech Zaremba is still there, Sam Altman is still there, and the rest have either gone quiet or left long ago.

Karpathy first left eight years ago. Since he was no longer at OpenAI, joining Anthropic doesn't count as a new departure, but the media framing will inevitably tie it to the storyline of "the outflow of OpenAI founders."

What stings more is where he went. He could have gone back to OpenAI, joined SSI, backed Thinking Machines, or gone to xAI. Among all the possible destinations, he chose OpenAI's direct competitor.

How Sam responds - or doesn't respond - this time is a window worth watching.

A bigger question is: Can top researchers still start their own businesses?

Karpathy's own experience carries a signal that is unfriendly to every researcher who wants to "fork an AI company."

He left OpenAI for the second time to start Eureka Labs. He was very clear about the company's direction: to reinvent education with LLMs. He wrote LLM 101 and ran more than a year of product and course experiments, but he didn't raise a large round of financing and no breakout product emerged. Then he chose to return to a large company.

Looking at this together with other cases in the past one or two years: Mira Murati's Thinking Machines has a rumored valuation of $10 billion but no product yet; Ilya's SSI has an even higher valuation but is completely silent to the outside world; David Luan's Adept was sold to Amazon; Inflection was acquired by Microsoft in its entirety.

A structural problem is looming: when the compute moat gets deeper every six months, when large companies can offer eight-figure dollar pay packages, and when the main battlefield is concentrated in the hands of three to five companies, is starting a company still the best option for top AI researchers?

Karpathy voted with his feet. His venture didn't fail: Eureka is still open, and he didn't run out of funding. He simply judged that at the current stage the biggest leverage point in AI is the model itself, not popularization, not education, and not anything on the periphery. He wants to return to that leverage point.

For every top researcher weighing "whether to spin off and start their own business," this is a chilling signal.

What Has Been Ignored

Let's go back to the last sentence of his original tweet, the one that almost no report has elaborated on: "I remain deeply passionate about education and plan to resume my work on it in time."

On the surface, this sentence gives Eureka a decent ending, but read closely, it is the most important sentence in the whole tweet.

Because its subtext is: now is not the right time.

A researcher who has personally worked on AI education for more than a year is publicly admitting that now is not the time to do it. This is a judgment that every entrepreneur in the "AI + education" track would rather not read, and it stings when they do. Is AI strong enough to reshape education? No. Is the education market ready to be reshaped by LLMs? Also no. Between these two "no's," he chose to go back upstream and push the model itself forward first.

This is the most honest judgment of the current stage by a top researcher: The most leveraged position in the next few years is to push the LLM itself forward. Everything else - education, agents, applications, products - has to wait for the model itself to cross the next step.

If he is right, the next-generation Claude will prove it. If he is wrong, the whole circle will collectively breathe a sigh of relief, because it will mean that pre-training really has hit a wall and everyone's bets on RL and agents are correct. The landscape will be more fragmented, and there will be more opportunities.

But Karpathy has shown us with his judgment in the past decade that he is not the kind of researcher who makes mistakes often.

And Anthropic has just signed him.

This article is from the WeChat official account “AI Disagreement”, author: Joshua, published by 36Kr with authorization.