400,000 Claude Code Sessions Prove: This Is the Most Valuable Skill in the AI Era

New Intelligence Yuan (新智元) 2026-06-17 16:20
Anthropic analyzed roughly 400,000 Claude Code sessions and found that what squeezes several times more output from AI is not coding skill but deeper domain expertise.

Can an accountant who has never written a single line of code outperform a professional programmer?

It sounds like a fantasy.

Just yesterday, Anthropic released a major report that puts a counterintuitive answer on the table, backed by 400,000 real conversations: very nearly, and the gap between the two is surprisingly small.

With privacy protections in place, Anthropic analyzed nearly 400,000 Claude Code sessions generated by about 235,000 users between October 2025 and April 2026.

They examined each conversation for what the user did, who made the decisions, and how things turned out, and reached a conclusion that challenges a common industry assumption:

What determines the success or failure of an AI programming task is not your coding skills, but how deeply you understand your own field.

In other words, AI programming does not shut out non-programmers. It has become a powerful tool for domain experts in every field.

Today the average Claude Code user spends 20 hours a week in the tool. That is four hours a day, five days a week, more time than many people spend with their families.

A hard question follows: where will the rapid growth of such tools take knowledge workers?

Anthropic's report is the first early signal grounded in real data.

Humans decide what to build, AI decides how to build it

Let's start with some hard numbers.

Anthropic built a "decision-attribution classifier" to examine every key decision in every conversation. "Planning decisions" cover what to do, which path to take, and what counts as done. "Execution decisions" cover which file to modify, what code to write, which language to use, and which commands to run.

They then labeled each decision as made either by the human or by Claude.

The result is extremely clear: Humans made about 70% of the planning decisions, and Claude took care of about 80% of the execution decisions.

In a nutshell: humans decide what to build, and the agent decides how to build it.

State clearly what you want, and it handles the rest of the grunt work.

Claude's behavior also shifts depending on who is in control.

When the user keeps a tight grip on execution (making more than 80% of the execution decisions), Claude takes only about 8 actions per round and simply follows instructions. When Claude leads the planning (making more than 80% of the planning decisions), it reaches about 16 actions per round. The tool runs at full power when the reins are loosened.
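To make the attribution idea concrete, here is a minimal sketch of how labeled decisions could be tallied into shares and bucketed with the 80% cut-offs quoted above. The data layout, the function names, and the sample session are all hypothetical; the report does not publish its classifier or schema.

```python
from collections import Counter

# Hypothetical labeled decisions for one session: (kind, decider).
# This layout is an assumption for illustration, not Anthropic's schema.
decisions = [
    ("planning", "human"),
    ("planning", "human"),
    ("planning", "claude"),
    ("execution", "claude"),
    ("execution", "claude"),
    ("execution", "claude"),
    ("execution", "claude"),
    ("execution", "human"),
]


def share(decisions, kind, decider):
    """Fraction of decisions of `kind` that were made by `decider`."""
    counts = Counter(d for k, d in decisions if k == kind)
    total = sum(counts.values())
    return counts[decider] / total if total else 0.0


def control_mode(decisions, threshold=0.8):
    """Bucket a session using the 80% cut-offs quoted in the article."""
    if share(decisions, "execution", "human") > threshold:
        return "user-directed execution"
    if share(decisions, "planning", "claude") > threshold:
        return "claude-led planning"
    return "mixed"


human_planning = share(decisions, "planning", "human")
claude_execution = share(decisions, "execution", "claude")
mode = control_mode(decisions)

print(f"human share of planning decisions: {human_planning:.0%}")
print(f"claude share of execution decisions: {claude_execution:.0%}")
print(f"session mode: {mode}")
```

In this made-up session neither side crosses the 80% line, so it lands in the "mixed" bucket, which is presumably where most real sessions sit given the overall 70% and 80% averages.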

This tacit division of labor between human and machine is much like leading a versatile crew: you do not have to do the manual work yourself, but you must know how a house gets built.

One word from someone who knows the problem is worth five from anyone else

The most striking part of the report is how it defines "expertise": it has nothing to do with job title and is specific to the task.

A senior engineer is a novice the first time they ask about Rust, while an accountant who has never touched Python counts as an expert on a task if they can tell Claude exactly which rules the month-end reconciliation must follow and can spot at a glance the edge cases the AI missed.

This is the sharpest insight in the report: expertise is not "which tools you can use" but "how deeply you understand the problem itself".

How big is the gap in the data?

In novice conversations, each instruction triggers only about 5 Claude actions and about 600 words of output. In expert conversations, the action chain more than doubles to about 12, and the output grows more than fivefold to about 3,200 words.

The gap holds across every type of work and every range of task value.
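The multipliers behind those figures are easy to check. The short sketch below uses only the approximate per-instruction numbers quoted above; the variable names are illustrative.

```python
# Approximate per-instruction figures as quoted in the article.
novice = {"actions": 5, "words": 600}
expert = {"actions": 12, "words": 3200}

# Expert-to-novice multiplier for each metric.
ratios = {metric: expert[metric] / novice[metric] for metric in novice}

for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.1f}x")
# prints:
# actions: 2.4x
# words: 5.3x
```

Since the inputs are rounded, the multipliers are rough too: about 2.4 times the actions and a little over 5 times the words per instruction.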

The same AI produces several times more for someone who knows the problem. The gap lies not in the tool but in the mind of the person using it.

Novices are most likely to give up

Who uses it more successfully?

The report's answer again points to knowing the domain.

Anthropic designed a strict success evaluation. First, a classifier reads the entire conversation record to judge whether the user achieved the goal. Then a "hard evidence" check is added: there must be a verifiable signal such as a git commit, passing tests, or explicit user confirmation.
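The two-stage rule described above can be sketched as a simple conjunction: the classifier's judgment must be positive, and at least one hard-evidence signal must be present. The field and function names below are hypothetical, chosen only for illustration; the report does not describe its implementation.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # All field names are hypothetical; the report publishes no schema.
    classifier_says_goal_met: bool
    has_git_commit: bool = False
    tests_passed: bool = False
    user_confirmed: bool = False


def has_hard_evidence(session: Session) -> bool:
    """True if at least one verifiable signal is present."""
    return (
        session.has_git_commit
        or session.tests_passed
        or session.user_confirmed
    )


def verified_success(session: Session) -> bool:
    """Strict standard: classifier judgment AND hard evidence."""
    return session.classifier_says_goal_met and has_hard_evidence(session)


# The classifier thinks the goal was met, but nothing verifiable backs it up.
unverified = Session(classifier_says_goal_met=True)
# The classifier thinks the goal was met, and the tests passed.
verified = Session(classifier_says_goal_met=True, tests_passed=True)

print(verified_success(unverified))  # False
print(verified_success(verified))    # True
```

Requiring both conditions explains why the verified rates are so much lower than the looser "at least partially successful" figures: a session that went well but left no commit, test run, or explicit confirmation does not count.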

Under this strictest standard, only 15% of novice conversations qualify, compared with 28% for intermediate users and 33% for advanced and expert users.

But the key information is in the shape of the curve: the biggest jump comes between novice and intermediate.

In other words, you do not need to be an absolute master of a field. A working grasp of it is enough to capture most of the benefit.

The gains flatten markedly from intermediate to expert.
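The shape of that curve can be read straight off the three quoted rates. A minimal sketch, using only the percentages above (the level labels are shorthand):

```python
# Verified success rates (percent) as quoted in the article.
rates = {"novice": 15, "intermediate": 28, "advanced/expert": 33}

levels = list(rates)
# Percentage-point gain at each step up the experience ladder.
steps = {
    f"{lower} -> {upper}": rates[upper] - rates[lower]
    for lower, upper in zip(levels, levels[1:])
}
total_gain = rates[levels[-1]] - rates[levels[0]]
first_step_share = steps["novice -> intermediate"] / total_gain

for step, gain in steps.items():
    print(f"{step}: +{gain} points")
print(f"share of total gain from the first step: {first_step_share:.0%}")
```

The first step is worth 13 points and the second only 5, so roughly 72% of the total 18-point improvement arrives by the time a user reaches the intermediate level.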

The gap is even starker when things go wrong.

When a conversation goes off the rails, with the user retrying repeatedly and starting to curse, Anthropic calls it a "troubled" conversation. Among troubled conversations, novices still end up with a verified success only 4% of the time, while experts do so 15% of the time.

Experts are not immune to hitting walls, but they know how to get the AI back on track when they do.

More striking still is the share of conversations that are judged failures and in which the user gives up before a single line of code is written: 19% for novices, versus only 5% to 7% for every other experience level.

The least experienced users are the first to give up at a hurdle. They fail not for lack of ability but because they do not know what to say to the AI next.

Occupation? It matters less

Are you a programmer, a lawyer, or a product manager?

To be honest, it's not that important anymore.

Anthropic mapped users to 23 major categories using the standard occupational classification system of the US Bureau of Labor Statistics (BLS).

The classifier was explicitly instructed not to assume that someone is a programmer just because they are writing code.

A lawyer who uses Claude to write a script that automatically reviews contract clauses is still classified as a legal practitioner, because the core work is law and code is just a means to an end.

Under this classification, the verified success rate is about 30% for software-related occupations and about 26% for other occupations. In conversations that actually produce code, it is 34% versus 29%.

Under the broader "at least partially successful" standard, the gap shrinks to a single percentage point: 89% versus 88%.

Among the ten largest occupations in the dataset, every success rate falls within 7 percentage points of that of software engineers. The gap has neither widened nor narrowed over seven months, and the success rates of both groups are rising in step.

Most unexpected of all, the verified success rate for management occupations is slightly higher than that for programmers.

Anthropic offers possible explanations: managers may be better at stating "this is what I want" clearly in a conversation, or directing an AI and leading a team may be the same kind of ability, namely breaking requirements down clearly, setting the direction, and making judgment calls at key points.

This finding comes close to rewriting the question of "who is most valuable in the AI era": the answer is not the person who writes code best, but the person who defines the problem best.

This is just a preliminary answer

Anthropic, of course, is careful about its claims.

It acknowledges that it cannot see real business outcomes. The "success rate" comes from classifier judgments of conversation records, which does not mean the code was ultimately adopted or generated commercial value.

The conclusion is preliminary and should not be oversold.

But the direction is clear enough to be unsettling: in AI programming, the barrier to writing code is being flattened, and the value of "understanding the business" is being greatly magnified.

What is happening in Claude Code may well be a preview of where all knowledge work is headed:

Anyone can have AI write code, but the truly valuable skill in this era is the ability to think a problem through clearly and state the requirements precisely.

Reference materials:

https://www.anthropic.com/research/claude-code-expertise

This article is from the WeChat official account "New Intelligence Yuan", author: ASI Revelation. It is published by 36Kr with authorization.