
An OpenAI heavyweight reveals: an undergraduate got into OpenAI on the strength of a single blog post. No Ph.D., zero published papers.

New Intelligence Yuan (新智元), 2026-02-25 19:10
These people all made unlikely leaps into top AI labs.

Without a doctorate or published papers, he impressed industry heavyweights and landed a job at OpenAI by publicly improving papers and running benchmark tests. Noam Brown confirms: initiative and open-source projects are the real tickets into top AI labs.

Recently, an article by Noam Brown, a legendary researcher at OpenAI and the father of Texas Hold'em AI, has gone viral.

Is it possible to get a job at a top-tier AI lab without a doctorate or research background?

This may sound like a fantasy, but the wonderful thing about the world is that there are actually quite a few such examples.

For instance, a young man named Keller Jordan managed to land a job as a machine-learning researcher at OpenAI just by publishing an open-source blog post!

Yes: instead of writing a paper, he fully open-sourced the entire research process, code, and experimental results on GitHub.

Noam Brown's conclusion: although the space for open research is smaller than it used to be, improving existing papers is still an excellent way to prove your abilities to lab researchers.

This approach will also give the other party more confidence and help you secure an interview.

From AI content moderation to the top of the field

In 2020, Keller graduated from UCSD with a double bachelor's degree in mathematics and computer science.

At that point, he had not published a single paper.

His first job was at an AI content-moderation startup.

One day, he read a paper recently published by Behnam, a top Google researcher, and came up with an improvement idea. So he sent an email to Behnam.

After reading the email, Behnam agreed to mentor this young man. Without any connections or background, the young man managed to connect with the industry giant.

Even more amazingly, this collaboration eventually led to a paper published at ICLR.

Later, Keller's standout project, the "NanoGPT speedrun", changed the research paradigm: it impressed Andrej Karpathy, Tesla's former AI director, and also caught OpenAI's attention.

This wasn't a traditional paper, but it became a turning point in Keller's life.

Since all his work was well-documented, with quantifiable results and clear progress, OpenAI didn't hesitate to offer him a position.

Impressing Karpathy with a "Well done!"

NanoGPT is an open-source project by Karpathy: a minimalist, lightweight framework for GPT training and fine-tuning.

One of Keller's favorite things to do was to continuously improve the training speed of NanoGPT. To achieve this, he kept trying new methods.

In October 2024, he achieved a result that increased the token efficiency of training the Transformer model by 3.8 times!

This directly earned him high praise from Karpathy.

The goal of the NanoGPT speedrun sounds simple: given a fixed model size (a 124M-parameter Transformer) and a fixed validation-loss target (3.28), train to that target with as few tokens, and in as little wall-clock time, as possible.

What Keller did was to transform Karpathy's nanoGPT/llm.c PyTorch training code into a reproducible, quantifiable, and comparable benchmark.

Ultimately, he increased the token efficiency by 3.8 times and reduced the number of tokens required from about 10B to 2.7B to reach the target loss.

This means that this improvement can be strictly verified and is a hard metric.
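The stopping rule that makes the speedrun a hard metric can be sketched in a few lines of Python. Everything below except the 3.28 target is illustrative: the loss curve is a simulated stand-in for real training, and the tokens-per-step figure is an assumption, not the actual batch size from the repo.

```python
import time

TARGET_VAL_LOSS = 3.28     # the speedrun's fixed validation-loss target
TOKENS_PER_STEP = 524_288  # illustrative tokens per training step (assumption)

def fake_val_loss(tokens_seen):
    # Stand-in for a real evaluation: loss decays smoothly as tokens are consumed.
    return 2.9 + 1.4 / (1.0 + tokens_seen / 1e9)

def run_speedrun():
    """'Train' until validation loss hits the target; report tokens and wall time."""
    tokens_seen = 0
    start = time.time()
    while True:
        tokens_seen += TOKENS_PER_STEP           # one simulated training step
        if fake_val_loss(tokens_seen) <= TARGET_VAL_LOSS:
            break                                # target reached: stop the clock
    return tokens_seen, time.time() - start

tokens, seconds = run_speedrun()
print(f"reached {TARGET_VAL_LOSS} val loss after {tokens:,} tokens")
```

Because the record is simply "fewest tokens (or least wall-clock time) to a fixed loss", any claimed improvement either moves that number or it doesn't; there is nothing subjective to argue about.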

Making experiments affordable for everyone

Moreover, Keller is very innovative.

Unlike many training processes that require hundreds of thousands or even millions in computing power costs, when designing this speedrun, he had a very clear principle: Keep the cost of trying new ideas as low as possible.

To achieve this, he took several measures: the code was compressed down to 537 lines; on a fresh 8×H100 node, setup and a full run take only about 20 minutes; and a single attempt costs as little as $8.

Even in today's AI research environment, this is an extremely rare design choice.

This means that from now on, it's not just large labs that can participate. Individual researchers, students, and independent engineers can all quickly verify their ideas, and innovation will no longer be blocked by the computing-power threshold.

Catching OpenAI's attention

In this way, the NanoGPT speedrun became a crucial part of Keller's journey to success.

Everything indicates that this result is very solid: The code, logs, and experiments are all fully reproducible; the metrics are impossible to cheat on; and there is even real participation from the development community.

Even the verification method is designed to be extremely rigorous: Each log file of the speedrun contains a complete copy of the code.

Anyone who wants to reproduce a record just needs to re-run the code contained in the corresponding log file.
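As a toy illustration of why embedding the code in the log makes reproduction trivial: the log layout below is entirely hypothetical (the real speedrun logs have their own format), but the mechanism, extracting the recorded code and re-running it, is the same.

```python
# Hypothetical log layout: some metrics, then the full script between markers.
log_text = """\
step 100 val_loss 3.41
step 120 val_loss 3.27
----- BEGIN EMBEDDED CODE -----
def val_loss_target():
    return 3.28
----- END EMBEDDED CODE -----
"""

BEGIN = "----- BEGIN EMBEDDED CODE -----"
END = "----- END EMBEDDED CODE -----"

def extract_embedded_code(log: str) -> str:
    """Pull the embedded code block out of a self-contained log file."""
    start = log.index(BEGIN) + len(BEGIN)
    return log[start:log.index(END)].strip()

code = extract_embedded_code(log_text)
namespace = {}
exec(code, namespace)                   # re-run exactly what the log recorded
print(namespace["val_loss_target"]())  # → 3.28
```

A log that carries its own code leaves no gap between "what was claimed" and "what was run", which is what makes the benchmark hard to cheat.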

The emergence of Muon

Then, the story reached its climax.

At the end of 2024, Muon, an optimizer he designed for the hidden layers of neural networks, arrived and promptly broke the training-speed world records for both the NanoGPT speedrun and CIFAR-10.

Muon is an optimizer designed for the 2D weight matrices of a network's hidden layers. Its core idea is to orthogonalize the update matrix produced by SGD with momentum, using a Newton-Schulz iteration to yield an update close to a semi-orthogonal matrix, thereby improving training efficiency.

It is simple and efficient to implement, supports stable operation at bf16 precision, and significantly reduces computational overhead.
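The Newton-Schulz step at Muon's core can be sketched in NumPy. The quintic coefficients below are the ones given in Keller Jordan's public Muon write-up; everything else (shapes, step count) is illustrative, and the real optimizer additionally handles momentum accumulation, bf16 precision, and per-layer integration.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximately orthogonalize a 2D matrix G, driving its singular values
    toward 1 without computing an SVD. The quintic coefficients come from the
    public Muon write-up; shapes and step count here are illustrative."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-7)  # Frobenius-normalize so iteration converges
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T                          # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X   # quintic polynomial in X
    return X.T if transposed else X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 6))          # stand-in for a gradient-momentum matrix
O = newton_schulz_orthogonalize(G)
print(np.linalg.svd(O, compute_uv=False))  # singular values roughly near 1
```

The appeal of the Newton-Schulz route is that it uses only matrix multiplications, which run fast and stably on GPUs even at low precision, whereas an exact SVD would be far more expensive.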

Compared with the AdamW optimizer, Muon performs amazingly in multiple tasks.

Although AdamW can train GPT, LLaMA, and Qwen stably and quickly, its limitations begin to show as parameter counts grow from hundreds of millions to hundreds of billions and training runs stretch from days to weeks or even months.

Although it has not yet become a mainstream general-purpose optimizer, Muon's emergence suggests it may be a major fundamental innovation in AI model training.

Joining OpenAI

As Muon's influence in the developer community grew, Keller officially joined OpenAI in December 2024.

Interestingly, Keller said in February that although Muon had become popular and helped him get into OpenAI, he wouldn't write a paper about it.

In his view, instead of publishing a paper on arXiv that would likely be "buried", it's better to continue to honestly research his optimizer.

After all, in his opinion, most optimizer papers are just superficial and unsubstantiated.

These people also broke into big labs against the odds

In addition, Noam Brown also listed other successful cases.

For example, Sholto Douglas, who was discovered by Google DeepMind.

He keeps a low profile on X, has never published an eye-catching first-author paper, and had only been in the industry for a year and a half. Yet he is a key figure behind Gemini's success.

While working at McKinsey, Sholto became convinced that AI would boom, so he started his own projects in his spare time and asked many insightful questions on JAX's GitHub.

These actions impressed James Bradbury, and he was finally invited to interview at Google DeepMind.

Andy Jones is a semi-retired quantitative analyst. Before test-time computation became popular, he wrote a paper comparing the impact of scaling up pre-training versus scaling up test-time computation.

The paper was impressive not because it broke a benchmark, but because of its very smart design choices: he wrote a GPU-accelerated environment himself and ran rigorous, detailed ablation experiments.

Finally, Andy Jones joined Anthropic.

References: 

https://x.com/polynoamial/status/2014084431062114744 

https://x.com/polynoamial/status/2014084432685326485 

https://x.com/polynoamial/status/2014084509575291163 

This article is from the WeChat official account "New Intelligence Yuan". Author: New Intelligence Yuan. Republished by 36Kr with permission.