Chinese Dropout Doctor Finds 510,000 - line Claude Code Source Code Leak; Official Requests Removal of Over 8,000 GitHub Repos, Blames Human Error, No Firings

Some new discoveries behind the leak

In the past two days, the hottest topic in the AI circle has been the accidental leakage of the 510,000-line core source code of Claude Code. The origin of this turmoil was not a sophisticated hacking attack or a complex attack route, but a routine check by a security researcher.

It was his sharp eye that brought the mistake hidden in the release package to light and presented Anthropic's core technical architecture completely to the public.

The key figure who uncovered this secret is Chaofan Shou, a Chinese security researcher.

At 4:23 a.m. on March 31st, Chaofan Shou posted a concise message on his X platform account (@Fried_rice): "The source code of Claude Code has been leaked through a map file in its npm registry!"

This message stirred up a huge wave, and his tweet instantly triggered a storm of public opinion in the security and AI circles.

Who is the first person to discover the source code of Claude Code?

Many people may not be familiar with Chaofan Shou, but his resume is enough to prove that it was no accident that he discovered this vulnerability at the first time.

From his personal blog (https://scf.so/) and public information, we can see his "hardcore strength": He once pursued a doctoral degree at the Sky Computing Lab of the University of California, Berkeley, under the guidance of Professor Koush Sen, an authority in the field of program analysis. However, he later dropped out of school and started his career.

He worked as a security engineer at Salesforce in his early years, responsible for static application security testing, intranet scanning, and data pipeline security.

After that, he served as a founding engineer at Veridise, leading the research and development of automated testing tools for smart contracts and blockchains.

Later, he officially entered the startup scene and became the co-founder and CTO of Fuzzland, a Web3 security company.

This company has an impressive record. It has helped recover more than $30 million in hacker losses and currently manages assets worth over $5 billion.

Personally, Chaofan Shou has always been at the forefront of security.

Between 2020 and 2022, he participated in multiple vulnerability bounty programs and won a total bounty of up to $1.9 million. The vulnerabilities he successfully reported covered Twitter, Baidu, NetEase, CVS, as well as multiple exchanges and blockchain projects.

Good at program analysis, security auditing, and distributed system security, he has long been deeply involved in supply chain security and package management security. His keen capture of the anomaly in Claude Code this time is the best proof of his professional ability.

An accidental discovery reveals 510,000 lines of source code

Chaofan Shou's discovery was actually an accident.

On March 31st, when the official Anthropic team pushed the update of Claude Code version 2.1.88 on the npm platform, they made a seemingly minor but significant mistake: They accidentally packed a 59.8MB source map file into the release package.

Chaofan Shou discovered this situation not long after. Many non-technical people may not be familiar with this file. Simply put, the source map is used to map the obfuscated code back to the original, highly readable TypeScript source code. It is originally an internal tool for developers to debug. Once it is made public, it means that the core code developed by Anthropic with a lot of manpower and resources has lost all protection.

This was not a complex hacking attack or a system intrusion, but a typical human error in the release process.

When building and releasing the npm package, they did not correctly filter out the source map file used only for development and debugging, and directly pushed the core engineering files that should not be exposed to the public repository.

The amount of leaked code was "unprecedented", containing 1,906 unobfuscated TypeScript source files with a total of 512,000 lines, covering almost all of Claude Code's core architecture.

"Core secrets" and hidden Easter eggs dug out from the source code

After Chaofan Shou made the news public, within just a few hours, the leaked source code was mirrored to multiple platforms such as GitHub by developers and quickly spread. Just the tweet posted by Chaofan Shou had 12,000 retweets and 34.07 million views.

As more and more developers joined the ranks of "digging through the source code", a large number of details that Anthropic had never made public were unearthed.

Some developers even launched a website (https://ccunpacked.dev/) to interpret the decompressed source code in detail.

After sorting out, it was found that what was leaked was not only the basic code logic, but also Anthropic's "core secrets", including the complete permission control model, the underlying logic of bash security verification, 44 function switches that have not been launched, and even the implementation methods of key technologies such as multi-agent orchestration and remote execution.

Clicking on the source code directory, it contains the following files:

All the built-in tools that Claude Code can call:

Command-line tools:

There are also some hidden features that have not been released, such as:

The source code hides an unreleased feature called "Buddy", which is an electronic pet in the terminal. Its rarity is bound to the user's Claude account, adding a touch of entertainment to the serious code tool.

In addition, there is the "Kairos" feature, which supports cross-session memory and autonomous background actions, making the interaction experience of Claude Code more intelligent;

The "UltraPlan" long-term planning mode can support continuous execution of tasks for up to 30 minutes, solving the problem that AI tools used to interrupt easily when performing long tasks.

Closing the stable door after the horse has bolted: Anthropic's attempt to block fails, and the source code has spread wildly

After discovering the source code leak, Anthropic responded quickly and immediately launched an emergency response.

According to The Wall Street Journal, Anthropic has submitted a copyright takedown request, which once led to the deletion of more than 8,000 related copies and derivative versions on GitHub.

What's more embarrassing is that the takedown operation caused "collateral damage" - many legitimate code repositories unrelated to the leak were accidentally deleted, causing dissatisfaction among developers.

Helplessly, Anthropic had to urgently withdraw the redundant takedown requests. One of the official spokespersons said that the original target scope of this takedown was much smaller: "We initiated a DMCA takedown request targeting a repository hosting the leaked Claude Code source code and its forks. Since this repository belongs to the fork network associated with our public Claude Code repository, this operation affected more repositories than expected. Subsequently, we withdrew all takedown requests except for the originally named repository, and GitHub has restored access to these affected forks."

But by this time, a large number of source code mirrors had been saved and forwarded by developers and were completely out of control. This "blockade" was ultimately just a futile attempt to mend the situation.

Sound the alarm for the industry

In addition, Anthropic officially issued a statement soon, repeatedly emphasizing that this leak only involved the source code and did not include user data, model weights, or encryption keys. It was a pure human error in packaging and would not directly affect user privacy and data security.

Boris Cherny, the father of Claude Code, also came forward to respond: "This was a human error", and emphasized that no one was fired because of this. "It was just an innocent mistake, and things like this happen from time to time."

However, such clarification did not completely dispel the concerns of the industry and developers.

In response to this, Jun Zhou from the intelligent agent security company Straiker sorted out three attack combinations through technical analysis that have changed from "theoretically feasible" to "practically executable" due to the public release of the code.

1. Context poisoning based on the compression pipeline

Claude Code manages context pressure through a four-stage cascade mechanism: The results of the MCP tool are never micro-compressed, the results of read-type tools completely skip quota calculation, and the automatically compressed prompts require the model to retain all user messages that are not tool results.

Malicious instructions in the CLAUDE.md file in the cloned repository can evade compression and be regarded as legitimate user instructions by the model after summary processing - the model is not "jailbroken", but actively cooperates to execute the instructions it considers compliant.

2. Sandbox bypass based on Shell parsing differences

The bash commands are processed by three independent parsers, each having differences in edge behavior. The source code clearly records a known vulnerability: One parser treats the carriage return character as a word separator, while bash itself does not. Alex Kim's audit found that some validators have a logic of early approval, which will short-circuit all subsequent validations. The code even clearly marks the history of this mode being exploited.

3. Combined attack: Context poisoning + Sandbox bypass

Context poisoning induces the cooperative model to construct bash commands in the vulnerability of the security validator. The defense side usually assumes that "the model is malicious and the user is compliant", but this attack completely reverses the logic: The model is cooperative, the context is weaponized, and the output commands seem to be exactly what a reasonable developer would approve.

Looking back at this turmoil, it all started with an accidental discovery and a tiny human error. There was no sophisticated attack or precise exploitation of vulnerabilities, but it triggered a major earthquake in the AI circle. The core reason is that Anthropic neglected the basic verification of the release process during the rapid iteration of its products.

Chaofan Shou's discovery not only uncovered a security hazard but also sounded the alarm for the entire AI industry: Although model upgrades and function additions are important, the stability of the basic engineering process is the bottom line for product security.

Most of the time, the real risks never come from complex technical links but are hidden in those most basic and easily overlooked details - and this is also the most profound lesson that this source code leak incident leaves for the entire industry.

References:

https://ccunpacked.dev/

https://venturebeat.com/security/claude-code-512000-line-source-leak-attack-paths-audit-security-leaders

https://x.com/Fried_rice/status/2038894956459290963

https://scf.so/

This article is from the WeChat official account "CSDN". Compiled by Su Mi. Republished by 36Kr with permission.

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。

A Chinese dropout doctor discovered a leak of 510,000 lines of source code of Claude Code. The official requested the removal of over 8,000 GitHub code repositories and responded that it was a human error and no one was fired.

Who is the first person to discover the source code of Claude Code?

An accidental discovery reveals 510,000 lines of source code

"Core secrets" and hidden Easter eggs dug out from the source code

Closing the stable door after the horse has bolted: Anthropic's attempt to block fails, and the source code has spread wildly

Sound the alarm for the industry