Share a Conversation with the Head of Claude Code

About Verification, Quality, Team, Planning and the Loneliness of Engineers

Today, I listened to a podcast. The guest was Fiona Fung, who leads Anthropic's Claude Code and Cowork teams.

Most people are still arguing about "whether AI can write code" and "whether it will replace engineers." She skipped these questions entirely and talked about what comes after them.

Fiona has been an engineer for 25 years and has experienced several moments of "everything has changed." Because she survived the previous rounds, her judgment on this wave of AI is on a completely different level from that of a junior engineer who has only been in the industry for two years.

Since the content is long and a bit scattered, I reorganized it after understanding it. At the beginning of the podcast, she put forward a view: Coding is no longer the bottleneck.

One data point:

On average, Anthropic engineers deliver eight times as much code per quarter as they did between 2021 and 2025. On the chart, the trend line is flat, flat, flat, and then it shoots straight up.

The explosion in code volume is just the surface. What really interests her is another thing: The people submitting code have changed.

In the Claude Code team, designers are submitting code, product managers are submitting code, and almost everyone is checking in code.

In the past, "writing code" was something only engineers did. Now it has become a basic cross-role ability, as common as sending an email.

Fiona's exact words:

The question becomes, where has the bottleneck shifted? Not only are more people submitting code, but they also come from different disciplines, and the throughput has become very high. So how do we verify? How do we confirm that the things generated at high speed are correct and of good quality?

This is a very subtle shift.

In the past, the core constraint of engineering teams was called "engineering bandwidth." Writing code was expensive, and time was scarce. Therefore, people made a lot of plans around this constraint to ensure that limited engineering resources were spent on the most worthwhile things.

Now that constraint seems to have disappeared: code can be generated cheaply and quickly.

But it hasn't really disappeared. It has just moved, shifting to steps that used to be relatively secondary, such as verification, review, and quality assurance.

Comparing it with her experience at Microsoft makes it very clear:

In the Visual Studio era, software shipped on CDs, and the deadline truly could not be postponed: the software had to be sent off to be pressed, boxed, and put on shelves. Engineering time was extremely precious, and team planning had to be optimized to the extreme.

Later, with online releases, the delivery method changed once. Today, with AI significantly accelerating the coding process, it has changed again.

The patterns of these three rounds are actually the same: The old bottleneck is broken through, and a new bottleneck emerges.

Lenny, the podcast host, told a story during the conversation. He knows an engineer whose first reaction to a feature request used to be, "It's too difficult, too complicated. I can't do it."

Now his reaction has completely changed: "This can definitely be done. Just let Claude Code handle it."

Fiona told a similar story. An engineer on her team had never worked on mobile applications, and a feature needed to be extended to mobile. In the past, this would have been a dead end because he wasn't an Android expert. Now, with Claude as a partner, he can make progress even in unfamiliar technical fields.

Fiona said: This indeed raises the upper limit of what each person can do.

This sentence seems simple, but it points to something significant.

In the past, what limited an engineer's output was technical complexity: If you didn't know how to do something, you couldn't do it. Now, what limits output has changed. It has become your judgment and ambition.

Theoretically, a lot of things can be done. The key is:

What do you choose to do, and how do you verify that you've done it right? "How fast you can write" is no longer the problem; whether you write correctly is.

This is the overall picture that Fiona sees: An explosion in code volume, blurred role boundaries, and an elevated upper limit of capabilities, all happening simultaneously.

It sounds like all good news, right?

Behind the good news are a bunch of new problems: Who will guarantee the quality? AI can help you do anything, but how do you know it's done it right? The team's output speed has increased eight times. Can the old management methods, quality frameworks, and planning rhythms still be used?

Fiona's daily work at Anthropic is to answer these questions.

......

The first change she made was to deploy a Claude Code remote session across all repositories.

This instance can access all the team's repos, Slack channels, and metric dashboards. To put it simply, she has a god-like view and can see at any time what everyone is doing and how they're doing it.

In the past, her mornings were like this:

She would hold a cup of coffee, open various feedback channels, and read one by one what users had said and what internal colleagues had proposed. If she had time, she would handle a few problems casually.

Now, the whole process has changed.

About a month or two ago, Claude Code launched Routines, and her entire workflow was rewritten.

Fiona herself said: "In the past, I had to write prompts myself. Now, with Routines, it's like having an Agent help me generate prompts and even generate PRs.

"For example: I set up a Routine to monitor the feedback channel every morning and automatically extract the themes. When I wake up, a feedback summary has already been generated, and there are also a few PRs waiting for me to review."

Lenny asked her: As a manager, in the past, when you saw a problem, you would send someone to fix it. Now, Claude has already come up with a solution, and you just need to review whether to merge it?

Fiona said yes, and it's more than that. If the verification is done well enough, you can even give the Agent more autonomy to execute directly.

Have you noticed that there is a very interesting role transition here?

In the past, you wrote instructions yourself and sat there waiting for the results. Later, you sent out several instructions at once and didn't have to wait idly. You could collect the results later.

Now, it has gone a step further: You set up a fixed process. This process helps you write instructions and then assigns tasks to different AI assistants. Your role has changed from "doing it yourself" to "reviewing after the fact." Taking another step forward, you become the person who builds the system.

Fiona herself is very honest:

In the past, she had to set aside large chunks of time to write code. Now, she has to set aside large chunks of time to digest all the work done asynchronously by AI assistants.

To put it simply, in the past, her brain was tired from writing code. Now, her brain is tired from constantly switching contexts and checking results one by one. This is a brand-new burden of context switching.

......

The manager's workflow has been rewritten, and efficiency has been maximized. Immediately, a question arises: Things are being produced faster, but who will guarantee the quality?

Fiona talked about several approaches in this regard, some of which are very strict, and some are quite unconventional.

Let's start with the strict one. The first approach is automated code review. As she put it: Think about it. Last year, we didn't even have Claude Code reviewing code automatically. In the past, human reviewers were a huge bottleneck.

Of course, where in-depth professional knowledge is required, humans still have to do the review. But many checks that can be standardized can now be handed over to Claude completely.

Her advice is very specific:

If you have a set of definitions of "what is good," whether it's design specifications, code styles, or architectural principles, write them down directly and put them in the code repository to create a rules file.

After Claude gets a clear framework, it performs very well in doing verification according to the rules.

There is only one prerequisite: This specification document has to be updated synchronously with the code. If you write it and leave it aside, it will be an expired piece of waste paper, which is worse than having nothing.

The second approach is test-driven development, known as TDD in the industry. To put it simply, write the test cases first and then write the code: set the standard you want to meet before you start working.

For the first bug she fixed in Claude Code, she had the AI follow this process: write the test first, confirm that it fails, and then fix the code so the test passes.

Her exact words:

TDD was very popular in the 2000s, and the theory is good. At that time, I struggled a bit. I felt like I was being forced to eat broccoli first. What I really wanted to do was build a product and launch it.

Now that test generation has been automated, those correct but unappealing practices from the past have suddenly come back to life.

I like this metaphor: broccoli is healthy but not delicious, and testing is important but annoying to write. Now, AI chews the broccoli for you, and you just enjoy the ready-made results.
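The test-first loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the version-parsing bug, the function names `parse_version_buggy`, `parse_version`, and `check_ordering` are all hypothetical and are not taken from the Claude Code codebase.

```python
def parse_version_buggy(text: str) -> tuple[str, ...]:
    """Hypothetical buggy code: keeps the parts as strings."""
    return tuple(text.strip().split("."))


def parse_version(text: str) -> tuple[int, ...]:
    """Hypothetical fix: converts each part to int so comparison is numeric."""
    return tuple(int(part) for part in text.strip().split("."))


def check_ordering(parse) -> bool:
    """The test, written before the fix: 1.10.0 must sort after 1.9.0."""
    return parse("1.10.0") > parse("1.9.0")


# Step 1: confirm the test fails against the buggy code.
assert check_ordering(parse_version_buggy) is False
# Step 2: confirm the same test passes after the fix.
assert check_ordering(parse_version) is True
```

The point of step 1 is that a test which never failed proves nothing: seeing it fail first shows that it actually exercises the bug.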

The third approach is a judgment framework she came up with herself, called Bad vs Sad. The English name doesn't matter. Just remember the meaning.

Bad refers to serious and irrecoverable errors. For example, while you're using the command-line interface, it suddenly crashes, and all the work you've done is lost. This is Bad.

Sad refers to problems that are a bit painful but can be recovered from. For example, the interface flashes for a moment, which is annoying, but no data is lost, and you can continue working without a problem.

Each product is different, and each team has to define for itself: What counts as Bad and what counts as Sad in your case.

Here is a very key insight: If you accumulate enough Sad, it will turn into Bad.

Fiona said: In the past, we had too many monitoring panels, and it was difficult to step back and see the big picture. Therefore, instead of staring at a bunch of raw performance data, it's better to have a judgment framework first to help you distinguish between a truly bad experience and an uncomfortable but tolerable one.

Make sure you're solving the real Bad problems, and at the same time, keep an eye on the trend of Sad to prevent it from getting worse.
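One way to make the Bad vs Sad idea concrete is a small triage function. The sketch below is an assumption-laden illustration, not the team's actual tooling: the `Incident` type, the rule "lost work means Bad", and the `SAD_BUDGET` threshold are all hypothetical choices that each team would define for itself.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BAD = "bad"  # serious and unrecoverable, e.g. a crash that loses work
    SAD = "sad"  # annoying but recoverable, e.g. a UI flicker


@dataclass(frozen=True)
class Incident:
    kind: str
    data_lost: bool


# Hypothetical budget: how many Sad incidents of one kind are tolerated
# per reporting window before the pile-up is treated as Bad.
SAD_BUDGET = 3


def classify(incident: Incident) -> Severity:
    """Hypothetical team rule: anything that loses user work is Bad."""
    return Severity.BAD if incident.data_lost else Severity.SAD


def triage(incidents: list[Incident]) -> dict[str, Severity]:
    """Return the effective severity per incident kind for one window."""
    sad_counts: Counter[str] = Counter()
    result: dict[str, Severity] = {}
    for incident in incidents:
        if classify(incident) is Severity.BAD:
            result[incident.kind] = Severity.BAD
        else:
            sad_counts[incident.kind] += 1
            result.setdefault(incident.kind, Severity.SAD)
    # Enough accumulated Sad turns into Bad.
    for kind, count in sad_counts.items():
        if count > SAD_BUDGET:
            result[kind] = Severity.BAD
    return result


window = (
    [Incident("cli_crash", data_lost=True)]
    + [Incident("ui_flicker", data_lost=False)] * 4
    + [Incident("slow_start", data_lost=False)]
)
print(triage(window))
```

With this sample window, the crash is Bad on its own, the flicker is promoted to Bad because it occurred four times against a budget of three, and the single slow start stays Sad.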

The last approach is a bit unconventional. In September last year, an engineer on the team proposed: We should track the frequency of users swearing.

Fiona thought it was a good idea. Sounds absurd, right? This indicator later really became a way for the team to monitor users' emotions. When the frequency of swearing goes up, it means the product is driving users crazy somewhere.

It can't replace proper performance indicators, but it provides a completely different perspective: When users are using your product, are they happy or cursing?
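As a rough illustration of what such a metric could look like, the sketch below computes the share of user messages per day that contain a listed word. Everything here is an assumption for illustration: the word list `SWEAR_WORDS`, the function names, and the per-day grouping are hypothetical, and the article does not describe how the team actually measures this.

```python
import re

# Hypothetical word list for illustration; a real one would be curated.
SWEAR_WORDS = {"damn", "wtf", "crap"}
WORD_RE = re.compile(r"[a-z']+")


def contains_swearing(message: str) -> bool:
    """True if any whole word in the message is on the list."""
    words = WORD_RE.findall(message.lower())
    return any(word in SWEAR_WORDS for word in words)


def swearing_rate(messages: list[str]) -> float:
    """Fraction of messages containing at least one listed word."""
    if not messages:
        return 0.0
    return sum(contains_swearing(m) for m in messages) / len(messages)


def rate_by_day(log: dict[str, list[str]]) -> dict[str, float]:
    """Map each day to its swearing rate so the trend can be watched."""
    return {day: swearing_rate(msgs) for day, msgs in log.items()}
```

Matching whole words rather than substrings avoids counting a word like "scrapbook" as swearing, and tracking a rate rather than a raw count keeps the number comparable as usage grows.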

We have a quality framework and detection methods. Who will execute them? What kind of people can survive and thrive at this speed?

......

Fiona is currently recruiting two types of people.

She calls the first type dreamers. In other words, they are creative people with product sense.

These people have an idea in their minds, start working on it by themselves, and then keep an eye on the feedback. If it's not good, they make changes and keep refining it until the experience is satisfactory. They take full responsibility for the entire product from the idea to the implementation.

The second type is deep systems experts.

When she first joined the Claude Code team, she found that it had very good all-around product people but lacked one type of person: engineers who really understand the underlying systems and distributed architectures. No matter how powerful the model is, in many places, someone still needs to understand what's really going on underneath to make reliable judgments.

Just having the right people is not enough. She repeatedly mentions a pairing concept: High initiative paired with high responsibility.

The original words are agency and accountability. You don't need to remember them. Just remember the meaning:

Everyone on the team can have their own ideas to solve problems. This is called high initiative. The prerequisite is that you have to answer two questions: What problem are you going to solve? What are your assumptions? You can't just be impulsive without giving an explanation.

To put it simply, you can go all out, but you have to take responsibility for the results.

Then she has a practice that I think is quite strict: All new managers must first go back to the front line to write code and immerse themselves in the codebase for at least a quarter to maintain a feel for the product.

Her logic is very simple:

The speed of change in Cowork and Claude Code is measured in weeks. As a manager, if you don't use the product with your own hands every day, you will quickly lose your judgment. By then, when you look at the data and listen to the reports, you won't really have confidence.

She went through this herself. She managed 500 people at Instagram and then went back to the front line at Anthropic. The last time she had shipped production code was in 2017, and she hadn't touched code for many years in between.

In the first week of joining the team, she almost fell back into the old management routine: Inviting engineers one by one for coffee and having a round of listening sessions. Then she stopped.

"Let me ask Claude first. It helps me get familiar with the codebase, run automated tests, and even design manual test plans. This gives me the confidence to deliver code again."

Later, many friends who hadn't written code for a long time told her the same thing: Thanks to Claude, they started delivering code again.

One thing about her has never changed: She insists on eating her own dog food. When working on VR, she didn't submit code in that codebase for fear of messing up the operating system. She spent a lot of time using the product herself and looking for problems.

She said that every time she does this, the team members are very grateful. Because in addition to the cold numbers, everyone needs to feel that their work is being seen.

If the leader is using the product made by the team, the team will feel that this person is still on their side, rather than just staring at the reports.

The right people are in place. But what do these people do every day, and how do you prioritize their tasks? Can the old management and scheduling ideas for product development still work? The answer is: No.

......

When Fiona first joined the Claude Code team, she tried to create a lightweight six-month roadmap. Three months later, she found that no one was looking at it anymore. Her conclusion was straightforward: The six-month roadmap is no longer useful. Just scrap it.

Now, the team is doing what's called just-in-time planning. In plain language, it means: Only plan for one month. It's very lightweight, and there isn't even a formal document. Everyone just checks the important things in a spreadsheet.

It's not that they don't think about the long term at all. There is a theme alignment session every six months, where the whole team sets a few major directions together. Which specific features to build and how to build them are all decided within the one-month window.

She also mentioned a detail:

She is even thinking about how to automate this monthly spreadsheet. The reason is very practical: She doesn't want "updating the spreadsheet" to become a new burden.

Damn it. The spreadsheet is supposed to help you work, not become something you have to serve.

Moreover, regarding processes, her attitude is even more straightforward: If it's useless, get rid of it.

The views in this article are solely those of the author. The 36Kr platform only provides information storage services.
