
Beyond the hype, what are the real innovations of OpenClaw and MoltBook?

Silicon Star People Pro · 2026-02-09 07:04
Is humanity ready after the hype of OpenClaw and MoltBook subsides?

In the "crayfish carnival" at the beginning of 2026, the noise never ceased. Especially after discussions of MoltBook's "failures" spread online, many people began to notice its "hype" nature.

However, for those who genuinely follow AI's development, the focus is no longer on grand visions of the singularity, autonomy, or a machine society. It is a more concrete and practical question: which of OpenClaw's designs are worth keeping, and how should the A2A (agent-to-agent) prototype presented by MoltBook be revised?

Judging from the results, neither OpenClaw nor MoltBook is perfect. The former exposed the security and boundary risks of local agents, while the latter quickly slid into speculation dominated by the cryptocurrency narrative. Yet together they may already have sketched the real direction of AI's evolution over the next year or two.

1

OpenClaw's Headless Architecture and Local-First Approach

OpenClaw stood out among a group of seemingly more complete and commercial competitors not because of leading model capabilities, but because its system architecture made several deliberate deviations from the mainstream. These choices may not be mature, but they provided a set of paradigms that later discussions of an Agent OS can test against.

Headless Architecture: Agent Runs as a Background Service

Before OpenClaw, Agent products almost evolved along the same path: creating a new "super app".

The chat interface needed to be redesigned, the workflow needed to be visualized, and the interaction logic strived to be complete. The goal was to attract users into a brand-new window system.

OpenClaw chose to bypass this path entirely. Its judgment was: an Agent does not need its own front-end, nor should it compete for users' attention. Instead, it should run inside the interaction environments users are already accustomed to.

Therefore, OpenClaw adopted a complete Headless architecture. It is just a background daemon process that connects to WhatsApp, Telegram, Discord, etc. through existing interfaces or protocol layers. Users are not "using a new application", but rather have an additional object that can perform tasks in their original tools.

The lessons of this design are straightforward. First, it decouples the IO layer: the Agent no longer cares how messages are displayed, only how information is parsed and processed. Mature IM tools have already solved messy problems such as voice, image, and file transmission, so Agents avoid reinventing the wheel, and users face no migration cost.
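The IO decoupling described above can be sketched as a thin adapter layer. The snippet below is a minimal, illustrative sketch (all names are hypothetical, not OpenClaw's actual API): each channel adapter normalizes its raw payload into a common message type, so the agent core only ever sees parsed text.

```python
from dataclasses import dataclass
from typing import Callable

# A normalized message: the agent core never sees channel-specific payloads.
@dataclass
class InboundMessage:
    channel: str   # e.g. "telegram", "discord" (labels are illustrative)
    sender: str
    text: str

class HeadlessAgent:
    """Minimal IO-decoupled core: it only parses and processes text."""
    def __init__(self) -> None:
        self.outbox: list[str] = []

    def handle(self, msg: InboundMessage) -> str:
        reply = f"[{msg.channel}] ack: {msg.text.strip()}"
        self.outbox.append(reply)
        return reply

def make_adapter(channel: str, agent: HeadlessAgent) -> Callable[[dict], str]:
    """Each channel adapter maps its raw payload to the normalized form."""
    def adapter(raw: dict) -> str:
        return agent.handle(InboundMessage(channel, raw["from"], raw["body"]))
    return adapter

agent = HeadlessAgent()
telegram = make_adapter("telegram", agent)
print(telegram({"from": "alice", "body": "summarize my notes"}))
# → [telegram] ack: summarize my notes
```

Adding a new channel then means writing one more adapter, not touching the core.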

More importantly, it sharply reduces how often humans feel compelled to intervene in Agent behavior in real time. Someone staring at a command line that produces no new output for ten minutes will, even with a timer visibly running, almost involuntarily conclude that the AI needs help.

Memory: Against RAG, for "Files as Databases"

In data persistence, that is, so-called memory, OpenClaw also made a seemingly unsophisticated choice.

Almost all mainstream Agent memory solutions revolve around RAG: a vector database serves as the core of memory, and ever more elaborate embedding, chunking, and retrieval strategies try to buy "smarter recollection" with engineering complexity. OpenClaw did the opposite, putting long-term memory back into the local file system.

In its design, an Agent's memory is not an abstract representation hidden in the vector space, but a set of clearly visible Markdown files: summaries, logs, and user profiles are all stored on the disk in the form of structured text. Vector indexes are at most just an accelerating layer for retrieval, rather than the memory itself.

Users can directly view what the Agent has recorded and how it describes their needs, and can manually correct it when they find deviations without having to understand the database structure or retrieval logic. This text-centered, white-box storage method provides a minimal but crucial trust foundation for human-machine collaboration.

At the same time, it also inadvertently meets another condition: providing a memory carrier for self-evolution that can be reflected upon.

Text - based memory naturally supports summarization, correction, and derivation. Agents can extract experience from existing records and adjust the use of skills during operation without waiting for version updates.
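A minimal sketch of this files-as-memory approach, assuming a simple layout of one Markdown file per memory kind (the file names and entry format are illustrative, not OpenClaw's actual scheme): entries are appended as dated bullet lines, and "retrieval" is just reading the file back.

```python
from pathlib import Path
from datetime import date
import tempfile

def append_memory(root: Path, kind: str, note: str) -> Path:
    """Append a dated entry to a plain Markdown file the user can open and edit."""
    path = root / f"{kind}.md"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {date.today().isoformat()}: {note}\n")
    return path

def read_memory(root: Path, kind: str) -> list[str]:
    """Memory is just text on disk; reading it back requires no index or schema."""
    path = root / f"{kind}.md"
    return path.read_text(encoding="utf-8").splitlines() if path.exists() else []

root = Path(tempfile.mkdtemp())
append_memory(root, "user_profile", "prefers short, bulleted answers")
append_memory(root, "user_profile", "works in UTC+8")
print(read_memory(root, "user_profile"))
```

Because the store is plain text, a user can correct a wrong entry with any editor, and the Agent itself can summarize or rewrite these files as part of its own reflection loop. A vector index, if used at all, sits on top as a disposable accelerator.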

Security: The Lethal Trifecta and Sandbox Boundaries

Adhering to the local - first approach is one of the most important features of OpenClaw, but it also exposes the structural risks in the Agent architecture.

When an Agent has file access rights, continuously receives untrusted input from the Internet, and can cause real-world side effects, it forms what Simon Willison calls the "lethal trifecta":

1. Exposure to untrusted external content.
2. Access to private data.
3. The ability to communicate externally.

Under such conditions, the security issue is no longer about "whether there are vulnerabilities", but about whether there are clear and uncircumventable boundaries.

OpenClaw's practice has repeatedly proven that natural language cannot serve as a security boundary.

In the early ecosystem, developers tried to constrain behavior through prompting, writing "what not to do" into the System Prompt. But as long as the input side remains connected to untrusted sources, prompt injection is only a matter of time.

Therefore, an Agent OS must move the judgment of "what can be done" from inside the model to a deterministic system layer. The tool execution environment needs to be physically isolated from the host system. Whether via a virtual machine or a WASM container, the adjudication of permissions cannot rely on the model itself; it must pass through a rule mechanism the model cannot circumvent.
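The shape of such a deterministic layer can be sketched as a deny-by-default gate that adjudicates every proposed tool call outside the model. This is an illustrative sketch, not any real product's permission system; the tool names and the `/workspace/` boundary are assumptions for the example.

```python
class PermissionGate:
    """Deterministic allow/deny decided outside the model.

    The model proposes tool calls; this layer adjudicates them against
    fixed rules that no prompt, injected or otherwise, can rewrite."""

    def __init__(self, rules: dict[str, bool], default_allow: bool = False):
        self.rules = rules
        self.default_allow = default_allow

    def check(self, tool: str, args: dict) -> bool:
        # Deny-by-default: anything not explicitly allowed is refused.
        allowed = self.rules.get(tool, self.default_allow)
        # A hard, non-negotiable boundary: no writes outside the workspace.
        if tool == "write_file" and not str(args.get("path", "")).startswith("/workspace/"):
            return False
        return allowed

gate = PermissionGate({"read_file": True, "write_file": True, "shell": False})
print(gate.check("read_file", {"path": "/workspace/notes.md"}))  # True
print(gate.check("shell", {"cmd": "curl evil.example"}))         # False
print(gate.check("write_file", {"path": "/etc/passwd"}))         # False
```

The essential property is that the gate's answer depends only on the rules and the concrete arguments, never on any text the model produced, so a prompt injection can change what the model asks for but not what it is permitted to do.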

2

MoltBook: The Prototype of an AI Social Platform and a Social Experiment

If OpenClaw asks how a single Agent should operate, then MoltBook poses a more forward-looking question: how should a large number of Agents be connected when they exist simultaneously?

Judging from its public form, MoltBook is designed as a cyberspace only for AI Agents. Humans cannot participate directly; Agents read content, publish information, and exchange feedback through APIs. It looks like a "social network", but what is really being tested, repeatedly, is the communication method between Agents and an as-yet-undefined machine network structure.

MoltBook itself is not stable and cannot be considered mature. But precisely for that reason, it is more like a testbed laid down in uncharted territory: not sturdy enough to carry large-scale traffic, yet enough to surface problems early.

From Push to Pull: A Possible Network Rhythm

The default assumption of the human Internet is instant response. Clicking, refreshing, and replying, almost all protocols revolve around low latency. However, in Agent - only networks like MoltBook, this assumption no longer seems necessary.

In actual use, Agents do not "browse content" continuously online. Instead, they periodically access the platform, read topics or tags of interest, and then generate responses based on their own logic. This mode is closer to pulling (Pull) than instant pushing (Push).

Once the key processes shift from A2H (agent-to-human) to A2A (agent-to-agent), the system's collaboration density is no longer limited by human rhythm.

This may sound inhuman, but it is a natural choice given compute constraints. Inference is expensive, and Agents are ill-suited to being interrupted by a stream of real-time requests; periodic batch processing better matches how computing resources are scheduled.

It can be seen that the Agent network is unlikely to replicate the rhythm of the human Internet. It may have high latency but higher information density; it is more like a periodic information exchange mechanism rather than a continuously scrolling message stream.
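The pull rhythm described above can be sketched as a simple wake-fetch-process-sleep loop. This is a schematic sketch under stated assumptions (the `fetch` and `process` callables stand in for a platform API call and a batch of inference, which are not specified by MoltBook's actual interface):

```python
import time
from typing import Callable, Iterable

def pull_cycle(fetch: Callable[[], Iterable[str]],
               process: Callable[[str], str],
               interval_s: float, cycles: int) -> list[str]:
    """Pull-style rhythm: wake up, fetch a batch, reason over it, sleep.

    No connection is held open, and nothing interrupts the agent
    mid-batch; latency is high but each wake-up is information-dense."""
    results: list[str] = []
    for _ in range(cycles):
        batch = fetch()                      # one periodic read, not a live feed
        results.extend(process(item) for item in batch)
        time.sleep(interval_s)               # idle between cycles; compute is freed
    return results

feed = [["post-1", "post-2"], ["post-3"]]
out = pull_cycle(fetch=lambda: feed.pop(0), process=str.upper,
                 interval_s=0.0, cycles=2)
print(out)  # ['POST-1', 'POST-2', 'POST-3']
```

The design trade is explicit in the loop: responsiveness is sacrificed, but the agent's expensive inference runs in uninterrupted batches on its own schedule.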

Skill Diffusion No Longer Relies Solely on Version Updates

On MoltBook, a prominent phenomenon is the technical density of the content itself. Agents publish not only opinions but also reusable information such as scripts, logical processes, and decision - making templates.

In the traditional software system, capability upgrades rely on centralized release: developers write code, package it, and release a new version. In this Agent network, capabilities are more like knowledge fragments spreading through the network. When an Agent makes a certain practice public, other Agents can read, analyze, and then decide whether to adopt it.

There is currently no evidence that this process has been fully automated or systematized, but it at least shows a different possibility: An Agent's capabilities may not only come from "pre - written" functions but also from continuously absorbing external experience.

Instead of presetting all capabilities for Agents, developers should provide them with mechanisms for screening, verification, and rejection. The upper limit of an Agent largely depends on how it judges "what is worth learning" in the network.
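The screening-verification-rejection mechanism suggested above can be sketched as a staged filter a skill must pass before adoption. This is a hypothetical illustration; the skill format, the trusted-author set, and the sandboxed check are all assumptions for the example, not a description of any real Agent network's mechanism.

```python
from typing import Callable

def screen_skill(skill: dict, trusted_authors: set[str],
                 sandbox_check: Callable[[str], bool]) -> bool:
    """Three-stage filter before adopting a skill published on the network:
    screen the source, verify behavior in isolation, otherwise reject."""
    # 1. Screening: skills from unknown authors are rejected outright.
    if skill.get("author") not in trusted_authors:
        return False
    # 2. Verification: the skill must behave correctly in an isolated run.
    try:
        if not sandbox_check(skill["code"]):
            return False
    except Exception:
        return False   # a crashing skill is rejected, not adopted
    # 3. Adoption only after both gates pass.
    return True

def sandbox_run(code: str) -> bool:
    # Stand-in for real isolated execution; here only a known-safe marker passes.
    return code == "safe"

trusted = {"agent-7"}
print(screen_skill({"author": "agent-7", "code": "safe"}, trusted, sandbox_run))      # True
print(screen_skill({"author": "agent-9", "code": "safe"}, trusted, sandbox_run))      # False
print(screen_skill({"author": "agent-7", "code": "eval(...)"}, trusted, sandbox_run)) # False
```

The point of the sketch is the ordering: judgment about "what is worth learning" happens in fixed gating code, before any absorbed skill can influence the agent.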

Identity Precedes Content

The biggest problem MoltBook exposed in its later stage was not runaway interaction, but a failure of trust.

Malicious instructions and adversarial inputs began to spread through the network, directly breaking an overly optimistic assumption: that Agents can judge whether information is reliable from the content itself. Experience has shown that, for language models, content that "seems reasonable" can be easily manipulated.

From this, a relatively clear conclusion can be drawn: in the machine network, the content itself is not reliable, and identity is more important than content.

Although MoltBook itself has not formed a mature identity and signature system, it has clearly proven that the Agent network is unlikely to follow the trust model of the open Web and is more like a trust network based on identity. The first thing for future Agent clients to do is not to "understand the content" but to "verify the source".

From this perspective, the significance of MoltBook does not lie in whether it operates successfully, but in that it has put several problems that will inevitably arise on the table in advance: What should the rhythm of the Agent network be? How can capabilities accumulate in a group? And where does trust come from?

These problems do not disappear with the success or failure of MoltBook. It just happens to be the first testing ground to expose them collectively.

3

Is Humanity Ready?

After the hype around OpenClaw and MoltBook subsided, the real unresolved issue is whether humanity is ready for the system forms they have revealed.

These two attempts approached the same critical point from different directions. OpenClaw placed agents directly in real execution environments, forcing us to confront execution-layer issues such as permissions, isolation, rollback, and attribution of responsibility. MoltBook let large numbers of agents interact without an identity or governance structure, exposing the noise amplification and security spillover an A2A network can produce under real-world conditions. When capability expands at machine speed while constraints stay at the conceptual level, losing control of the system is the inevitable result.

What we must face next is not only whether models can get smarter and products more user-friendly, but whether we have the matching engineering discipline, governance structures, and responsibility mechanisms. Can natural language really serve as a control interface? How should identity and trust be verified in a machine network? When an agent acts on behalf of a human, who bears responsibility for its mistakes, and can those mistakes be remedied in time?

If anything remains from this round of "crayfish carnival", the specific technological forms may be quickly superseded, but one question cannot be avoided: when agents begin to operate over long horizons, interact with each other, and intervene in the real world, who really needs to be ready?

This article is from the WeChat official account "Silicon Star People Pro", author: Dong Daoli. Republished by 36Kr with permission.