AI Talent War: Fresh PhD Graduates Offered Annual Salaries of 5 Million Yuan

In the AI era, stock alone is not enough; you need tokens too.

"Last year, one large company hired dozens of fresh PhD graduates at annual salaries of over three million yuan. For the very top ones, the competing offers ran to five or six million, which is even more extravagant than Huawei's 'Genius Youth' program was in its day."

Xiao Mafeng, founder of the AI talent recruitment platform TTC, told Pencil News that HR teams from large companies have camped out at several top universities to recruit young people in batches for AI projects.

Meta offered signing bonuses of $100 million, salary not included, to poach researchers from OpenAI.

Not long ago, Lin Junyang, the technical leader of Qwen, left Alibaba. The media covered his departure the way the sports front pages of major newspapers cover star-player transfers every late August: nonstop.

Today, top AI researchers are fought over and followed just like top football stars. For the first time, scientists have become celebrities, and they are earning far more than their predecessors ever did.

01 Poaching with $2.7 billion

In August 2024, Google spent $2.7 billion to acquire a chatbot company, Character.AI.

The valuation of this company was about $1 billion.

Google was willing to pay a premium not because the company held any blockbuster patents, but to recruit two people: the company's founders, Noam Shazeer and Daniel De Freitas.

Google's condition for the deal was that they pack up and return to work at Google immediately.

Shazeer in particular is an unavoidable figure in the development of large models.

In 2017, he was one of the core co-authors of the famous paper "Attention Is All You Need", which proposed a brand-new model architecture, the Transformer.

Today, almost every large model you have heard of, including GPT, Gemini, and Claude, is built on this architecture.

In 2021, he left Google in a huff and founded Character.AI because Google was worried about the risks and didn't allow him to release the chatbot Meena.

Three years later, Google bought him back, and he became one of the three core leaders of Google DeepMind, leading development of the new-generation Gemini models.

It is plain to see that since his return, starting with Gemini 2.5 Pro in 2025, Google's models have competed with GPT on an equal footing.

This is not to say that Shazeer alone deserves the credit. The point is that in today's large-model development, a single academic mind at the core can raise the ceiling of what a team achieves. Anyone who follows sports will understand this well.

02 Increasing the Winning Rate

"You can't think of people like Lin Junyang as traditional engineers. One press of the Enter key, and whether the run succeeds or not, tens of millions of dollars may be gone."

Wang Tiezhen, head of Hugging Face in China and a senior engineer, told Pencil News that the value of top AI researchers lies not in execution but in decision-making.

Training a large model costs money at every step, from choosing between candidate models (A or B) to settling the architecture and selecting the data, and often requires tens of millions of dollars in investment. These decisions are often made by one person. If the decision is right, the model gets built; if it is wrong, the money may be wasted.

"Most engineers have only trained small models. They have no experience with ultra-large-scale training and have never taken a model all the way from zero to one. Top talents, by contrast, know how to work with data, have actually completed large-model training, and have experience in RL (reinforcement learning), alignment, and reasoning models. They know whether a route can work, and that kind of experience can only be gained through practice."

Beyond experience, technical intuition is also the preserve of a few. Wang Tiezhen explained: "If a person doesn't have a clear judgment about the future, their decision-making will be limited. Even when many key signals are right in front of them, they may fail to seize them. For example, when Dario Amodei was at Baidu, he noticed emergent capabilities and the scaling law. The findings were later published, but the problem is that others lacked that sensitivity and didn't know what they meant. Those who understand the value can gather enough resources to turn the prediction into reality."

Compared with the Internet business of the past, AI is a hard-technology field that demands heavy capital investment. The money spent on GPUs and computing power is enormous. Meta is estimated to have 600,000 NVIDIA GPUs, Microsoft 300,000 to 500,000, and Google hundreds of thousands as well. These giants' annual capital expenditure on this is at least $30 billion. The industry estimates that a single training run for a GPT-4-level model costs hundreds of millions of dollars.

Whom can you trust with this kind of "heavy-duty equipment"? The old saying that one Qian Xuesen is worth five divisions applies here exactly. However high the salaries of top AI researchers go, they are small change next to the cost of buying GPUs and cloud computing power.

Why did companies in the traditional Internet era not depend on a few stars the way they do today? Wang Tiezhen explained that Internet products essentially succeed through engineering systems and user scale, and can be adjusted on the fly as user data comes in. Training a large model is "like making a nuclear bomb. There aren't infinite experiments. With only a limited number of attempts, you must find the people you can trust the most."

Xiao Mafeng saw something similar when he was a headhunter for the real-estate industry. "Some real-estate bosses offer annual salaries of tens of millions of dollars (to recruit executives), and in cash. People think that's outrageous. But for them, people aren't the biggest cost. Capital is. A project may involve tens of billions of dollars. With that much invested, the required winning rate is very high, so finding the right people matters a great deal."

(Image: still from "Moneyball")

In the brutal competition among large models, betting on star researchers is also a bet on the winning rate.

03 Star Effect

"People like Lin Junyang are the starting point for building a team. How much does it cost to buy hardware and GPUs? Every investment expects a return, which means someone has to pay later on. For a project with large-scale investment, the sunk cost is very high and future funding is uncertain. With a star on board, those willing to pay a high price later are more likely to invest, because technically there is already replicable experience."

A technical leader from a large company told Pencil News that the experience of star researchers is like an endorsement for the enterprise.

One rumor goes like this. A hardware manufacturer, Company A, poached a large-model expert from a well-known large-model company just as Company A happened to be seeking financing. After the expert joined, the fundraising went smoothly, and the company soon announced a new round of tens of millions of dollars. The expert stayed only briefly before moving on to another giant. It cannot be confirmed that Company A poached the expert for the sake of financing, but during the expert's stay, Company A did get real money from investors at top speed.

"Industry veteran = valuation endorsement" holds in other industries too. According to a report in Caijing magazine, the valuation of a startup founded by a former DJI employee tripled in three months. Some business leaders from DJI can raise 20 million as soon as they decide to leave and start a company, even before settling on a specific direction.

In addition to increasing the valuation, the "star effect" of top AI researchers is becoming one of the most important brand assets of a company.

Wang Tiezhen gave an example. In the past, Chinese technology products were often seen as low-cost alternatives. In AI, because Chinese open-source models have ranked at the top of the global open-source community, local researchers have become part of the global technical conversation. At the 2025 NeurIPS conference, Lin Junyang gave a talk.

"These people's papers are all available, and peers around the world can see them and remember their names. If a company wants to expand its global business, it can just bring people like Lin Junyang to meet the executives in various enterprises who are doing AI research. Many people actually want to communicate with them. The influence of local researchers on the world is unprecedented."

Against this backdrop, the relationship between researchers and companies is also being re-examined.

People come not only for money but also for their ideals. Top AI talents may not be won over by salary alone, or even by the stock options that were so attractive in the Internet era. Computing power has become the latest bargaining chip. After Lin Junyang left, for instance, there was discussion within the company about how computing-power resources were allocated.

According to Pencil News, the founder of a large company in the north personally negotiated with a star AI researcher. The money on offer was naturally substantial, but another important condition was the promise of a set amount of computing-power resources for the researcher to allocate.

In the AI era, offering stock alone is not enough; you need tokens too.

This article is from the WeChat official account "Pencil News" (ID: pencilnews), written by "Honest One", and is published by 36Kr with authorization.

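The views expressed in this article are solely those of the author. The 36Kr platform only provides information storage services.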
