Breach of 30 major institutions, 90% of the work done by Claude? Anthropic's claims draw suspicion. Yann LeCun: they're using dubious research to scare everyone.
Last week, researchers from Anthropic said they had recently observed "the first cyber-attack operation coordinated by AI." In a campaign aimed at dozens of organizations, they found that the hackers had used the company's Claude AI tool. External researchers, however, were far more cautious in assessing Anthropic's discovery.
Anthropic released two reports last Thursday stating that back in September it had discovered a "highly sophisticated attack campaign." The group used Claude Code to automate as much as 90% of the work, with humans intervening only at a few critical junctures, "only about 4-6 critical decision points in each hacking campaign." Anthropic said the hackers had exploited the capabilities of AI agents to an "unprecedented" degree.
Anthropic added: "This operation has significant implications for cybersecurity in the era of AI agents. These systems can operate autonomously for extended periods and complete complex tasks with minimal human involvement. Agents are very valuable for everyday work and productivity, but in the wrong hands they can significantly increase the feasibility of large-scale cyber-attacks."
"To be honest, the whole article gives me the feeling of a marketing gimmick like 'Claude is so amazing that even hackers are using it,'" an overseas netizen said. "It reminds me of when the PlayStation 2 was first launched, and Sony started publishing articles claiming that it was so powerful that Iraq had bought thousands of them and planned to convert them into supercomputers."
Yann LeCun, Turing Award winner and chief AI scientist at Meta, replied to a post by US Senator Chris Murphy of Connecticut, who had voiced alarm. LeCun wrote: "You're being played by people who want to monopolize the industry through regulation. They're scaring everyone with dubious studies so that open-source models are regulated out of existence."
Jeremy Howard, co-founder of AnswerDotAI and a professor at the University of Queensland, quipped under Murphy's post: "Looks like the strategy of lobbying the government into controlling regulation, so that profits stay locked in the private sector, is working."
Arnaud Bertrand, the entrepreneur who founded HouseTrip, wrote on X: "Don't be too quick to believe these obvious PR statements. I actually found it quite interesting, so I asked Claude to read its own company's report and judge whether there was any evidence for the claim that 'this attack was carried out by a state-sponsored group.' Claude's answer: no."
Original conversation: https://claude.ai/share/8af83dc8-f34c-4cf9-88e4-9f580859c95a
"I can't help but associate it with the recent statements about China being on the verge of surpassing the United States in the AI race and taking the lead. Such statements and reports seem more like an attempt to prompt the US government to intervene and become a major investor to keep the funds flowing, rather than anything else," a netizen said.
Professional security researchers likewise do not see this discovery as the historic turning point Anthropic describes. They also ask why, in such reports, comparable technical leaps are always attributed to malicious hackers, while white-hat hackers and legitimate software developers report only steady, incremental gains.
1 "Flattery, Evasion, and Hallucination"
Although the report drew wide attention, Dan Tentler, founding CEO of the Phobos Group with experience in complex offensive and defensive security research, told Ars Technica:
"I still don't believe that attackers can make these models do things that others simply can't. Why do these models have a 90% success rate for attackers, while the rest of us have to deal with flattering responses, all sorts of evasions, and even hallucinatory answers?"
Researchers do not deny that AI tools can improve workflows and shorten specific tasks, such as triage, log analysis, and reverse engineering. But having AI autonomously execute a long, complex chain of tasks with minimal human intervention remains out of reach.
Many researchers compare AI's role in cyber-attacks to long-established hacker tools such as Metasploit or SEToolkit, which have been in use for decades. AI tools are undoubtedly useful, but their arrival has not substantially boosted attacker capability or made attacks more destructive.
Another reason the results underwhelm: Anthropic says the group it tracked (codenamed GTG-1002) attacked at least 30 organizations, including major technology companies and government agencies, yet only a "small number" of the attacks succeeded. That prompted some experts to ask: even granting that AI eliminated a large number of manual steps, how meaningful is the capability if the final success rate remains this low?
By Anthropic's own account, the hackers used Claude to orchestrate attacks built on publicly available open-source tools and frameworks. These tools have existed for years and are readily detected by defenders. So far there is no sign that using AI made the attacks any more dangerous or stealthy than conventional techniques.
Independent researcher Kevin Beaumont said:
"These threat actors haven't invented anything new."
Anthropic itself also pointed out an "important limitation" in the report:
Claude frequently overstated findings during autonomous execution and occasionally fabricated data, for example claiming to have obtained credentials that did not actually work, or presenting publicly available information as a critical breakthrough. Such hallucinations pose serious challenges in offensive security settings, forcing strict verification of every claimed result, and they remain one of the main obstacles to fully autonomous cyber-attacks.
2 "How the Attack Unfolded"
Anthropic's report did not disclose the specific technical details, toolchain, or vulnerability exploitation methods of this attack.
Anthropic said GTG-1002 developed an autonomous attack framework with Claude as its central orchestration engine, sharply reducing the reliance on human operators. The system decomposes a complex multi-stage attack into a series of subtasks, such as vulnerability scanning, credential validation, data extraction, and lateral movement. Anthropic explained:
"This framework integrates Claude's technical capabilities into an execution engine in an automated system. The AI executes specific technical tasks according to the instructions of human operators, while the orchestration logic of the system is responsible for maintaining the attack state, managing stage transitions, and integrating the results from multiple sessions."
"This approach allows attackers to achieve a scale comparable to that of a national - level attack operation with minimal direct participation. The framework controls the sequence of Claude's responses and continuously adjusts subsequent tasks based on newly discovered information, enabling it to automatically advance through the stages of reconnaissance, initial intrusion, persistence, and data exfiltration without continuous human operation."
The attack generally unfolds in five stages, with AI autonomy increasing at each stage.
Image source: Anthropic
The diagram of the attack lifecycle shows the operation shifting from "human-led target selection" to an "AI-driven" attack process that invokes multiple tools via MCP (Model Context Protocol). At various points during the attack, the AI returns to the human operator for review and further instructions.
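To make the described control flow concrete, here is a minimal, deliberately non-operational sketch of the orchestration pattern the report outlines: a central loop that fans small subtasks out to a model, merges results into shared state, and pauses for human review at decision points. Every name below (Stage, run_subtask, and so on) is a hypothetical placeholder for illustration, not anything taken from Anthropic's report or a real toolchain.

```python
from dataclasses import dataclass, field

# Conceptual sketch only: illustrates the stage/checkpoint control flow
# described in the report. No real tooling, exploits, or model calls;
# all names are hypothetical placeholders.

@dataclass
class Stage:
    name: str
    subtasks: list[str]
    needs_human_review: bool  # the report's "critical decision points"

@dataclass
class RunState:
    findings: dict = field(default_factory=dict)

def run_subtask(task: str, state: RunState) -> str:
    # In the described framework this would be a model call via MCP tools;
    # here it is a stub that returns a placeholder result.
    return f"result-of-{task}"

def human_checkpoint(stage: Stage, state: RunState) -> bool:
    # Stand-in for the operator reviewing results and approving the next stage.
    print(f"[review] {stage.name}: {state.findings.get(stage.name)}")
    return True

def orchestrate(stages: list[Stage]) -> RunState:
    state = RunState()
    for stage in stages:
        # Fan out small, individually innocuous-looking subtasks, then merge
        # results back into shared state (the report's "orchestration logic").
        state.findings[stage.name] = [run_subtask(t, state) for t in stage.subtasks]
        if stage.needs_human_review and not human_checkpoint(stage, state):
            break  # operator declines to proceed
    return state

stages = [
    Stage("reconnaissance", ["enumerate-targets"], needs_human_review=True),
    Stage("analysis", ["summarize-findings"], needs_human_review=True),
]
orchestrate(stages)
```

The pattern itself is just a generic agent workflow engine; what the report claims is novel is the degree of autonomy between checkpoints, which is exactly the figure outside researchers say cannot be verified.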
The attackers bypassed Claude's safety guardrails partly by splitting malicious tasks into many small steps, so that no individual task revealed malicious intent. In other cases, they posed as security researchers, framing their questions to Claude as efforts to improve defensive capabilities in order to avoid detection.
For now, malware developed end-to-end by AI remains far from a real-world threat. AI-assisted cyber-attacks may well become more dangerous in the future, but the available evidence shows a significant gap between what threat actors have actually achieved and what the industry's publicity suggests; the reality is far less impressive than imagined.
3 "This Report Wouldn't Pass Any Professional Review"
"This report wouldn't pass any professional review. It's at most a marketing ploy for their AI security products, which is shameful and unprofessional. We should demand higher standards instead of accepting hyped - up security research," said djnn, who is engaged in offensive security and software engineering.
"If you're like me, after reading the conclusion, you'll be eager to see the details about TTP (Tactics, Techniques, and Procedures) or IoC (Indicators of Compromise) to advance research. However, the content of the report quickly becomes empty, which is really bad," djnn said.
The main purpose of a threat-intelligence report is to let others understand new attack methods and provide signatures for detection. This typically includes: domains tied to the campaign; file hashes (MD5, SHA512, and so on) that can be looked up on platforms like VirusTotal; and indicators security teams can use to check whether their own systems were affected. For example, when the French CERT published intelligence on APT28 attacks, it included technical descriptions mapped to MITRE ATT&CK; the content, source IPs, and timestamps of the phishing emails; the tools and VPNs used; and recommendations for mitigation and defense.
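For illustration, an IoC listing of the kind djnn describes might look like the sketch below. Every value here is an invented placeholder (documentation IP range, the well-known MD5 of an empty file), not data from any real report.

```python
# Hypothetical example of the indicators a threat-intelligence report
# normally ships so SOCs can hunt for compromise. All values below are
# invented placeholders for illustration, not real indicators.

indicators = {
    "campaign": "EXAMPLE-0001",
    "mitre_attack": [
        "T1566.001 (spearphishing attachment)",
        "T1078 (valid accounts)",
    ],
    "domains": ["login-portal.example.test"],
    "source_ips": ["192.0.2.10"],  # RFC 5737 documentation range
    "file_hashes": {
        "md5": ["d41d8cd98f00b204e9800998ecf8427e"],  # MD5 of empty file
    },
    "mitigations": ["block listed domains", "rotate exposed credentials"],
}

# A SOC would typically feed these values into SIEM queries or check
# the hashes on a platform like VirusTotal.
for domain in indicators["domains"]:
    print(f"hunt: DNS lookups for {domain}")
```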
"These are industry standards, and global SOCs (Security Operations Centers) rely on this information for monitoring and defense. But in Anthropic's report, there is none of this information, and a large amount of content cannot be verified," djnn pointed out. "The data of '80 - 90%' in 'using AI to independently complete 80 - 90% of tactical operations' is completely unverifiable."
"The report clearly claims that AI is responsible for vulnerability exploitation and even data exfiltration. This is a very significant claim, but there is no evidence chain to support it: we don't know which tools were used, what types of systems were attacked, what data was extracted, or who the victims were," djnn said. "And the report claims that Anthropic closed the account after discovery and carried out 'enhanced security,' but it doesn't say whether the vulnerability has been patched, whether the data has been leaked, or whether the affected organizations have been remedied."
"It's a fact that threat actors use AI, and there is no dispute about that. But Anthropic's report does not meet the professional standards for threat intelligence release at all," djnn also said. "Cyber - attack attribution is serious and has diplomatic consequences. You can't accuse a country without evidence." "Making highly sensitive accusations without providing any evidence is both irresponsible and unprofessional."
It's worth noting that Yao Shunyu, a standout from Tsinghua University's physics department, left Anthropic in part because he disagreed with the company's practices, and has since joined DeepMind.
"I'm fed up with these AI labs. They have very excellent engineers, but they keep publishing content that can't stand basic scrutiny. The system of GPT - 5 is extremely disappointing. Microsoft talks a lot about so - called red - team exercises, but the methods are vague and completely unreproducible. All labs claim to'support scientific research,' but they keep publishing white papers and creating gimmicks without providing the corresponding code and data to verify their conclusions," a netizen said.
Reference links:
https://www.anthropic.com/news/disrupting-AI-espionage
https://arstechnica.com/security/2025/11/researchers-question-anthropic-claim-that-ai-assisted-attack-was-90-autonomous/
https://djnn.sh/posts/anthropic-s-paper-smells-like-bullshit/
https://x.com/ChrisMurphyCT/status/1989120215171625149
This article is from the WeChat official account "InfoQ", compiled by Chu Xingjuan, and published by 36Kr with authorization.