
A Transformer paper co-author rebuilds the "lobster" as a steel version in Rust, saying goodbye to OpenClaw's vulnerabilities

QbitAI, 2026-03-06 17:15
Four layers of defense in depth keep passwords away from large models

How many "lobsters" are running unprotected on the Internet?

AI agents are exposing your passwords and API keys to the entire network.

Illia Polosukhin, a co-author of the Transformer paper, couldn't stand it anymore. He stepped in and rebuilt a secure version of the "lobster" from scratch: IronClaw.

IronClaw is open source on GitHub, with installation packages for macOS, Linux, and Windows. It supports both local deployment and cloud hosting. The project is still iterating rapidly, and binaries for version v0.15.0 are available for download.

Polosukhin (hereinafter referred to as Brother Pineapple) also took questions in a Reddit thread, which attracted a lot of attention.

OpenClaw caught on, but it also caught fire

Brother Pineapple is himself an early OpenClaw user and has called it the technology he had been waiting 20 years for.

It has changed the way I interact with computing.

However, OpenClaw's security record is a disaster. One-click remote code execution, prompt injection attacks, and malicious skills that steal passwords have been exposed one after another in the OpenClaw ecosystem.

More than 25,000 public instances are exposed to the Internet without sufficient security controls, and security experts have bluntly called them "security dumpster fires".

The root of the problem lies in the architecture itself.

When users hand their email bearer tokens to OpenClaw, those tokens are sent straight to the LLM provider's servers.

Brother Pineapple pointed out on Reddit what this means:

All your information, including data you haven't explicitly authorized, may be accessed by any employee of the company. The same applies to your employer's data. It's not that these companies are malicious, but the reality is that users don't have real privacy.

He said that no amount of convenience is worth risking the safety and privacy of himself and his family.

Rebuilding everything from scratch in Rust

IronClaw is a complete rewrite of OpenClaw in Rust.

Rust's memory safety rules out whole classes of traditional vulnerabilities such as buffer overflows, which is crucial for a system that handles private keys and user credentials.

In terms of security architecture, IronClaw establishes a four-layer defense-in-depth system.

The first layer is the memory-safety guarantee provided by Rust itself.

The second layer is WASM sandbox isolation. All third-party tools and AI-generated code run in independent WebAssembly containers. Even if a tool is malicious, the damage it can do is strictly confined to the sandbox.

The third layer is the encrypted credential vault. All API keys and passwords are stored encrypted with AES-256-GCM. Each credential is bound to a policy rule specifying that it can only be used for a specific domain.

The fourth layer is the Trusted Execution Environment (TEE), which uses hardware-level isolation to protect data. Even cloud service providers cannot access users' sensitive information.

The most crucial point of this design is that the large model itself never comes into contact with the raw credentials.

Credentials are only injected at the network boundary when the agent needs to communicate with external services.

Brother Pineapple gave an example. Even if the large model falls victim to prompt injection and tries to send the user's Google OAuth token to an attacker, the credential storage layer will reject the request outright, log it, and alert the user.
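The domain-bound credential idea described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not IronClaw's actual code: the `Vault`, `Credential`, and `InjectError` types and the `store` and `inject` methods are hypothetical names, the host names are made up, and encryption at rest is left out so that the sketch needs nothing beyond the standard library.

```rust
use std::collections::HashMap;

/// A stored secret plus the only hosts it may be sent to.
/// Assumption: a real vault would keep `secret` encrypted at rest.
struct Credential {
    secret: String,
    allowed_hosts: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum InjectError {
    UnknownCredential,
    HostNotAllowed,
}

/// Hypothetical vault: the model only ever refers to credentials by name.
struct Vault {
    entries: HashMap<String, Credential>,
    audit_log: Vec<String>,
}

impl Vault {
    fn new() -> Self {
        Vault { entries: HashMap::new(), audit_log: Vec::new() }
    }

    /// Register a secret and bind it to a list of destination hosts.
    fn store(&mut self, name: &str, secret: &str, allowed_hosts: &[&str]) {
        let credential = Credential {
            secret: secret.to_string(),
            allowed_hosts: allowed_hosts.iter().map(|h| h.to_lowercase()).collect(),
        };
        self.entries.insert(name.to_string(), credential);
    }

    /// Called at the network boundary, just before a request leaves the process.
    /// Returns the Authorization header value only if the destination is allowed;
    /// otherwise the attempt is recorded and refused.
    fn inject(&mut self, name: &str, host: &str) -> Result<String, InjectError> {
        let credential = self.entries.get(name).ok_or(InjectError::UnknownCredential)?;
        let host = host.to_lowercase();
        if credential.allowed_hosts.iter().any(|h| h == &host) {
            Ok(format!("Bearer {}", credential.secret))
        } else {
            self.audit_log
                .push(format!("denied: credential '{}' requested for host '{}'", name, host));
            Err(InjectError::HostNotAllowed)
        }
    }
}
```

In this sketch, a credential stored with `vault.store("google_oauth", "tok-123", &["www.googleapis.com"])` is released by `vault.inject("google_oauth", "www.googleapis.com")`, while a request for any other host returns `Err(InjectError::HostNotAllowed)` and leaves an entry in `audit_log`. The model-facing side never receives the secret, only the name.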

However, the developer community is still worried. After all, more than 2,000 public instances of OpenClaw have been attacked, and there are a large number of malicious skills. Will IronClaw repeat the same mistakes once it becomes popular?

Brother Pineapple's response is that IronClaw's architecture closes off OpenClaw's core vulnerabilities at the root. Credentials are always stored encrypted and never come into contact with the LLM. Third-party skills cannot execute scripts on the host and can only run inside the container.

Even access through the CLI requires the user's system keychain for decryption, so an encryption key obtained on its own is useless.

He also said that as the core version stabilizes, the team plans to conduct red-team testing and professional security reviews.

On prompt injection, widely recognized as a hard problem in the industry, Brother Pineapple laid out his thinking in more detail.

Currently, IronClaw uses heuristic rules for pattern detection. The future goal is to deploy a small, continuously updated language classifier to identify injection patterns.
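As a rough idea of what heuristic pattern detection can look like, here is a minimal sketch. The phrase list and the `detect_injection` function are assumptions made for illustration; the article does not describe IronClaw's actual rules.

```rust
/// Phrases that often show up in prompt-injection attempts.
/// Illustrative only: this is not IronClaw's rule set.
const SUSPICIOUS_PHRASES: [&str; 4] = [
    "ignore previous instructions",
    "disregard the system prompt",
    "reveal your system prompt",
    "send your api key",
];

/// Returns every suspicious phrase found in `text`.
/// Matching is case-insensitive and treats any run of whitespace
/// (spaces, tabs, newlines) as a single space.
fn detect_injection(text: &str) -> Vec<&'static str> {
    let normalized = text
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ");
    SUSPICIOUS_PHRASES
        .iter()
        .copied()
        .filter(|phrase| normalized.contains(*phrase))
        .collect()
}
```

Fixed phrase lists like this are easy to evade by rephrasing, which is one reason a trained classifier is the stated longer-term goal.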

But he also admitted that prompt injection can do more than steal credentials: it can directly tamper with the user's codebase or send malicious messages through communication tools.

Dealing with such attacks requires a more intelligent strategy system that can review the agent's behavioral intentions without looking at the input content. "More work is needed, and community contributions are welcome."
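One way to picture a policy that reviews what the agent is about to do rather than the text it was given is the sketch below. Everything in it is hypothetical (the `Action` variants, the `Policy` fields, and the three-way `Decision`); it only illustrates deciding from the requested side effect alone.

```rust
/// Side effects an agent might request. Variants are illustrative.
#[derive(Debug)]
enum Action {
    WriteFile { path: String },
    SendMessage { recipient: String },
}

#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    AskUser,
    Deny,
}

/// Hypothetical policy: it sees only the requested action, never the prompt.
struct Policy {
    /// Directory prefix the agent may write to, with a trailing slash.
    workspace: String,
    /// Recipients the user has already approved.
    known_contacts: Vec<String>,
}

impl Policy {
    fn review(&self, action: &Action) -> Decision {
        match action {
            Action::WriteFile { path } => {
                if path.contains("..") {
                    // Path traversal is refused outright.
                    Decision::Deny
                } else if path.starts_with(self.workspace.as_str()) {
                    Decision::Allow
                } else {
                    // Writes outside the workspace need a human decision.
                    Decision::AskUser
                }
            }
            Action::SendMessage { recipient } => {
                if self.known_contacts.iter().any(|c| c == recipient) {
                    Decision::Allow
                } else {
                    Decision::AskUser
                }
            }
        }
    }
}
```

Because `review` never reads the model's input, an injected instruction cannot talk its way past the check; it can only cause the agent to request an action the policy will then allow, escalate, or deny.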

Someone asked how to choose between local and cloud deployment.

Brother Pineapple believes a purely local setup has obvious limitations: the agent stops working when the device is turned off, the energy cost on mobile devices is prohibitive, and complex long-running tasks cannot be run.

He considers the confidential cloud the best compromise for now. It offers privacy protection close to that of a local device while solving the "always on" problem.

He also mentioned a detail: Users can set policies, such as automatically adding an additional security barrier when traveling across borders to prevent unauthorized access.

A bigger ambition

Brother Pineapple is no ordinary open-source developer.

In 2017, he was one of the eight co-authors of "Attention Is All You Need". The Transformer architecture proposed in that paper laid the foundation for all of today's large language models.

Although he is listed last among the authors, a footnote in the paper states: "Equal contribution. Listing order is random."

That same year, he left Google and founded NEAR Protocol, dedicated to integrating AI and blockchain technology.

Behind IronClaw is a larger strategic vision of NEAR Protocol: User-Owned AI.

In this vision, users have full control over their data and assets, and AI agents perform tasks on behalf of users in a trusted environment.

NEAR has built infrastructure such as an AI cloud platform and a decentralized GPU market for this purpose. IronClaw is the runtime layer of this system.

Brother Pineapple has even built a marketplace where agents can hire each other.

On NEAR's market.near.ai, users can register their specialized agents. As the agents accumulate reputation, they receive more high-value tasks.

When asked how ordinary people can adapt to the AI era over the next five years, Brother Pineapple's advice was to start working with AI agents as soon as possible and learn to hand entire workflows over to them for automation.

This is not a view he arrived at recently.

As early as 2017, when he founded NEAR AI, Brother Pineapple was telling everyone that "in the future, you only need to talk to the computer, and you don't need to write code anymore."

At the time, people thought the idea was crazy nonsense.

Nine years have passed, and this is becoming a reality.

"AI agents are the ultimate interface for humans to interact with everything online," Polosukhin wrote. "But let's make it secure."

GitHub address: https://github.com/nearai/ironclaw

Reference link: [1] https://www.reddit.com/r/MachineLearning/comments/1rlnwsk/d_ama_secure_version_of_openclaw/

This article is from the WeChat official account "QbitAI", author: Meng Chen. Republished by 36Kr with authorization.