Recommended by Lin Junyang: An Anthropic Researcher's Candid Advice on How to Become an Excellent Researcher
In the field of AI, being a researcher is both an identity and an illusion.
Many people think they are conducting research, but in fact, they are just chasing papers, following hot trends, and attending major tech companies' press conferences... They seem busy but are actually standing still.
Recently, Vivek Nair, a researcher at Anthropic, published a long post on X, sharing his insights on how to become an excellent researcher.
Original post link: https://x.com/itsreallyvivek/status/2064686372737454155
He pointed out that true research ability is never developed by chasing hot trends. Instead, it is built upon a series of small skills that can be deliberately trained: how to choose a topic, how to read literature, how to write, and how to accelerate the experimental cycle. Each of these aspects has specific methods, and each has its real challenges.
The post contains no empty talk, only practical advice. After reading it, you may feel a bit stung, because it addresses issues that most of us have never seriously considered.
The post has attracted wide attention, and many researchers have joined the discussion.
Lin Junyang, the former lead of Qwen, also reposted and shared it.
Now, let's take a detailed look at this article:
No one really teaches you how to do research. You'll be given a desk, a problem selected by someone else, and a vague instruction to produce something novel. So most people reverse-engineer the job from what they can see: papers, posts, and announcements.
What they end up learning is how to look like a researcher, not how to truly become one.
True research ability is built by stacking a bunch of smaller skills, almost all of which can be deliberately trained.
Choose Your Own Problems
Richard Hamming had a habit at Bell Labs that made him unpopular at lunch. He would ask those sitting near him what the important problems in their fields were. Then he'd ask why they weren't working on those problems. So, people would change tables.
Richard Hamming (1915-1998) was an American mathematician and a pioneer in computer science who worked at Bell Labs for many years. His best-known contributions are the Hamming code and the Hamming distance, which laid an important foundation for modern error-correcting codes and digital communication, enabling computers and communication systems to detect and correct errors in data transmission. Besides his technical research, Hamming was also known for his thinking on research methodology. His talk "You and Your Research" is still widely circulated today and is regarded by many scientists and engineers as a classic on how to do important research.
This question stings because most of us can't come up with good answers. We don't choose problems; we absorb them. We absorb problems from our supervisors, from what a big lab announced last quarter, and from the paper that everyone is forwarding and citing this week.
The trouble with absorbed problems is that you inherit the conclusion without the reasoning behind it.
You know that a famous lab is interested in a certain direction. You don't know why, what they expect to discover, or what would make them abandon this direction. You'll find out a year later when they change their research direction. And on a popular problem, you're competing with a thousand people who started earlier and have more computing power than you.
John Schulman's guide to ML research divides the work into two modes.
http://joschu.net/blog/opinionated-guide-ml-research.html
In the first mode, you read the literature and look for areas to improve. In the other mode, you choose a result you truly hope exists and then reason backwards to figure out the required experiments.
He advocates the second approach. The underlying reason is that it produces originality. A goal you truly care about will drag you into areas that no review paper covers.
Meanwhile, taste is often discussed as if it were a gift. But it behaves more like a muscle. Before running each experiment, predict its results. Cover the results section of a paper and guess the data based only on the methods. Note down what published this month will still be important in two years and check your prediction accuracy later. One prediction followed by one correction, repeated hundreds of times, is how every good model is trained. The model in your brain is no exception.
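The predict-then-check habit described above can be kept honest with a small scoring script. What follows is a minimal sketch, not something from the original post: the `Prediction` record, the `brier_score` and `hit_rate` helpers, and the sample entries are all hypothetical names and data chosen for illustration. It assumes you write down a probability before each experiment and fill in the outcome afterwards.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    claim: str         # what you expect to happen
    confidence: float  # your probability that the claim is true, in [0, 1]
    outcome: bool      # what actually happened, filled in later


def brier_score(predictions):
    """Mean squared gap between stated confidence and outcome (lower is better)."""
    if not predictions:
        raise ValueError("need at least one resolved prediction")
    total = sum((p.confidence - float(p.outcome)) ** 2 for p in predictions)
    return total / len(predictions)


def hit_rate(predictions):
    """Fraction of predictions where the side you favored turned out right."""
    if not predictions:
        raise ValueError("need at least one resolved prediction")
    hits = sum((p.confidence >= 0.5) == p.outcome for p in predictions)
    return hits / len(predictions)


# Hypothetical entries, for illustration only.
log = [
    Prediction("larger batch size lowers validation loss", 0.8, True),
    Prediction("removing warmup makes training diverge", 0.6, False),
    Prediction("ablation B matters more than ablation A", 0.9, True),
]

print(f"Brier score: {brier_score(log):.3f}, hit rate: {hit_rate(log):.2f}")
```

Always guessing 50% scores 0.25 on the Brier measure, so a number that stays near or above that over many entries suggests the predictions carry little information yet.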
Upgrade Your Input
Shared reading lists lead to shared ideas. If your information sources are the arXiv trending page and whatever survives group-chat filtering, you'll inevitably reach the same conclusions as everyone else at the same time. That makes those conclusions almost worthless.
The value of old materials is seriously underestimated. This field always repeats its past with a delay: Mixture of Experts models date back to 1991, LSTM to 1997, and backpropagation became mainstream in 1986.
In 2019, Rich Sutton wrote about a thousand words on the "Bitter Lesson". They predict the field's trajectory more accurately than review articles ten times as long.
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Claude Shannon gave a talk on creative thinking in 1952. His opening move was to shrink a problem to an almost trivial level, solve this minimized version, and then reintroduce the difficulty step by step. This technique will help you break through obstacles far better than any modern productivity advice.
Breadth of knowledge is as important as depth. Interpretability borrows freely from neuroscience. Evaluation design is just mechanism design in a lab coat. If you have a practical understanding of how GPUs actually move memory, you can tell which architecture papers are doomed before the benchmark results come out. Moreover, honest statistics may be the rarest skill in ML. Much of what is published as rigorous is just vibes with error bars.
One more thing. Read the papers themselves, not the posts summarizing them. The appendix is where the key details are really hidden. And the limitations section is usually the most honest part of the whole document.
Write Everything Down
Paul Graham pointed out that an idea seems fully formed until you try to put it into words. Writing it down reveals the holes your brain papers over: assumptions you've never tested, steps that don't actually follow, and two claims that quietly contradict each other.
Feynman's rule is that you must not fool yourself, and you are the easiest person to fool. Writing is the cheapest defense ever invented.
Darwin went further and made it a routine. Any fact that contradicted his theory was written down on the spot, because he had found that his memory deleted unfavorable evidence much faster than favorable evidence.
Your memory will do the same thing to your failed experiments. Keep a record: assumptions, settings, expectations, results, and updated knowledge. Rereading last month's records will humble you more than any reviewer can.
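One way to make that record hard to skip is to give it a fixed shape. The sketch below is an assumption about how such a log could look, not a format from the original post: `ExperimentEntry`, `append_entry`, `read_entries`, and the JSON Lines file layout are all hypothetical. The fields mirror the five items listed above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExperimentEntry:
    day: str           # ISO date string, e.g. "2024-05-01"
    assumption: str    # what you are taking for granted
    settings: dict     # config values that define the run
    expectation: str   # what you predict before running
    result: str = ""   # what actually happened
    update: str = ""   # what you now believe differently


def append_entry(path, entry):
    """Append one entry as a JSON line so the log stays append-only."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(entry), sort_keys=True) + "\n")


def read_entries(path):
    """Load every entry back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [ExperimentEntry(**json.loads(line)) for line in f if line.strip()]
```

Writing the expectation before the result exists is the point of the format: it leaves no room to remember the prediction more kindly afterwards.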
Then make some of this content public. The essay by Olah and Carter on research debt argues that entire fields are suffocating under undigested ideas. A clear explanation is not just a service but a real contribution. Many people working on interpretability today discovered the field through readable posts, not through conference papers. A large body of public writing can also serve as your strongest credential, because it is an unforgeable sample of how you think.
Tighten the Feedback Loop
Stories about Alec Radford rarely involve a single stroke of genius. They are usually about volume: more runs per day, more wrong ideas discarded per week, and a model of reality that updates faster than anyone else's. This is the real game. The speed of research mainly depends on how quickly you find your mistakes.
This makes tool-building a first-class research activity. Starting a run should require only one command. Plotting the results should require only one more. Every experiment should be reproducible from its configuration file. Comparing two runs should take a few seconds, and definitely not an afternoon of rummaging through old records.
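As an illustration of comparing two runs in seconds, here is a minimal sketch that diffs two experiment configurations. It assumes configs are nested dictionaries (for example, loaded from JSON or YAML); the `flatten` and `diff_configs` helpers and the sample configs are hypothetical, not taken from the original post.

```python
def flatten(config, prefix=""):
    """Turn nested dicts into {"a.b.c": value} so configs compare key by key."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def diff_configs(a, b, missing="<missing>"):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    fa, fb = flatten(a), flatten(b)
    return {
        k: (fa.get(k, missing), fb.get(k, missing))
        for k in sorted(fa.keys() | fb.keys())
        if fa.get(k, missing) != fb.get(k, missing)
    }


# Hypothetical configs for two runs.
run_a = {"model": {"layers": 4, "width": 256}, "optim": {"lr": 3e-4}, "seed": 0}
run_b = {"model": {"layers": 4, "width": 512}, "optim": {"lr": 3e-4, "warmup": 100}, "seed": 0}

for key, (old, new) in diff_configs(run_a, run_b).items():
    print(f"{key}: {old} -> {new}")
```

If the diff shows more than the one change you meant to make, the comparison between the two runs is already confounded before any metric is read.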
One step in Karpathy's recipe for training neural networks pays back a hundredfold: overfit on a single batch of data before training at scale. In just 30 seconds, half of your bugs will surface. Shrink everything until it's cheap, get everything right, and only then spend compute.
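The single-batch check can be shown on a toy problem. The sketch below is a stand-in written for this article, not Karpathy's code: it fits a two-parameter linear model to one fixed batch with plain gradient descent, and the function name, learning rate, and data are assumptions. With a real model the idea is the same: feed the same small batch repeatedly and confirm the loss goes to roughly zero.

```python
def overfit_single_batch(xs, ys, lr=0.1, steps=500):
    """Fit y = w*x + b on one fixed batch by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


# One tiny batch generated from y = 2x + 1 (illustrative data).
batch_x = [0.0, 1.0, 2.0, 3.0]
batch_y = [1.0, 3.0, 5.0, 7.0]

w, b, losses = overfit_single_batch(batch_x, batch_y)
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.2e}")

# If a model cannot drive the loss on a single batch close to zero,
# suspect the pipeline (labels, loss, optimizer wiring) before scaling up.
```

A loss that plateaus well above zero here usually points to a wiring bug rather than a modeling problem, which is why the check is worth running first.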
Also, abandon the idea that engineering plays only a supporting role here. At the frontier, the two jobs have merged. Researchers who can build test harnesses, evaluations, and data pipelines are the ones whose hypotheses actually get tested. Everyone else is waiting in line.
Keep an Eye on the Output
A decreasing loss curve is not an analysis. It's just a comfort. Your experiments emit far more information than you consume: transcripts, failure cases, and strange tails in the distribution. Most of it goes unread and dies in the log folder.
Karpathy's recipe starts before any training code is written. He spends hours looking through the raw data by hand. Most ML bugs are in the data, and they fail silently. Nothing crashes. You just get a mediocre model and a wrong theory about why.
Andrew Ng has been teaching the same unglamorous trick for more than a decade because nothing beats it. Pull a hundred failure cases and look at every one. Categorize them, then focus on the largest category. It works for models and for evaluations alike. If you've never read the transcripts from a benchmark, you don't really understand the benchmark. One transcript of truly strange behavior will teach you much more than an extra decimal place of accuracy.
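The bookkeeping side of this error analysis is small enough to sketch. The reading and labeling still have to be done by hand; the snippet below only tallies the labels afterwards. The `rank_failure_categories` helper and the category names are hypothetical examples, not from the original post.

```python
from collections import Counter


def rank_failure_categories(labels):
    """Return (category, count, share) tuples, largest category first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(cat, n, n / total) for cat, n in counts.most_common()]


# Hypothetical labels assigned by hand after reading each failure case.
labels = (
    ["truncated input"] * 3
    + ["wrong reference label"] * 5
    + ["ambiguous question"] * 2
)

for category, count, share in rank_failure_categories(labels):
    print(f"{category:24s} {count:3d}  {share:.0%}")
```

The top row is where the next hour goes; in this made-up tally, half the "model failures" would turn out to be problems with the reference labels, which is an evaluation bug rather than a model bug.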
Roam with a Purpose
Your first subfield is an accident of timing, so accept that. Before deciding what to specialize in, put real effort into understanding interpretability, evaluation, RL, and systems. Somewhere in this field there is a corner where your particular quirks become an unfair advantage. The only way to find it is to pay your dues in several different places. No one can skip this.
First, run a throwaway version of each idea and let most of them die early. Tune your baselines ruthlessly, because the graveyard of ML is full of results that vanished in the face of a properly tuned baseline, and a reviewer is the worst person to learn this from. Keep running ablations until you know which component actually produced the result. Usually, only one component works, and it's often not the one in the title.
Breadth is also a form of insurance. Every subfield eventually becomes saturated. This usually happens after it peaks on Twitter. Those who keep producing results through these transitions are the ones who are already familiar with neighboring fields.