Behind the popularity of OpenClaw: new questions about agents, AI Coding, and team collaboration
In 2026, the discussions surrounding OpenClaw are no longer as simple as "a new Agent tool has become popular."
On the one hand, it quickly gained popularity because it can connect chat tools, desktop environments, and skill systems, letting users drive a computer to perform tasks continuously through conversation. On the other hand, controversy emerged almost simultaneously: Is it the prototype of a new generation of agent products, or just another over-hyped technical concept? Is it really a low-threshold tool that ordinary users can pick up easily? As more and more teams discuss it alongside AI Coding, SPEC-driven development, and team collaboration processes, the questions become more specific: Can agents truly enter the R&D process? Where are the boundaries of AI-written code? How should teams balance efficiency against controllability?
Ahead of the QCon Global Software Development Conference · Beijing 2026, the InfoQ × QCon live program "Geek's Appointment" invited Deng Lishan, a senior technical expert at Taobao Flash Sale; Jiang Tianyi, technical leader of the NetEase CodeWave Business Center; and Chu Qiushi, a senior expert engineer at Ping An Technology, to discuss the popularity of OpenClaw, hands-on experience, capability boundaries, security risks, and how teams can put AI Coding into practice.
Some of the wonderful viewpoints are as follows:
- OpenClaw is not a "low-threshold" product as described by many official accounts. To use it effectively, one needs to be familiar with JSON configuration, have troubleshooting capabilities, and continuously debug and optimize skills. There is still a considerable threshold for ordinary users.
- In the requirement-understanding stage, structure the requirements through SPEC, then convert them into technical designs through tasks and architectures so architects can review the rationality of the technology stack, solutions, and interfaces. Then enter the plan stage for step-by-step execution. This can, to a certain extent, keep AI coding within a controllable framework.
- In the future, it will be more valuable for programmers to write specifications and design architectures than to write specific code. The specific implementation can be left to AI.
- To truly unleash efficiency, one must first lay a good foundation: in the development process, complete the business functions, but also leave knowledge and specifications behind in the codebase.
The following content is based on the live discussion and has been abridged by InfoQ.
Why did OpenClaw appear at this time?
Deng Lishan: Recently, OpenClaw has become very popular. What was your first judgment when you first heard or saw OpenClaw?
Jiang Tianyi: The emergence of OpenClaw is similar to that of Manus. Such products do not appear suddenly; they are the natural result of technical capabilities crossing a certain threshold. Take Manus as an example. As early as 2023, OpenAI clearly put forward key agent concepts such as tool use, memory, and context. In September 2024, tool-use capability gradually matured and MCP emerged. In mid-2025, large-context-window models became widespread, and products like Manus could genuinely land. OpenClaw's situation is similar: it rode the wave of rapidly advancing large-model capabilities. One of its core abilities is to flexibly write skills and continuously extend an agent's capabilities through coding. From a technical perspective, it makes effective use of the long-context ability of Claude Code 4.6, Programmatic Tool Calling (PTC) (https://www.anthropic.com/engineering/advanced-tool-use), and the tool-use mechanism of skills. The emergence of such products is therefore not an accidental technological breakthrough but the concentrated result of several technologies maturing at once.
In the future, products similar to OpenClaw will continue to appear. It represents a trend of "product-technology fit": when technical capabilities truly match the product form, such products emerge. From a product perspective, OpenClaw became popular quickly largely because it meets the needs of specific user groups: multi-channel information collection, data analysis, automated post-publishing bot operations, plus operations, maintenance, and information-aggregation capabilities, which closely fit the scenarios of self-media practitioners, one-person companies, and independent developers.
Desktop agents and remote agents have already been explored at many companies. For example, Anthropic's Cowork proposed the concept of desktop agents earlier; later, Alibaba's QoderWork, Tencent's Workbody, and Minimax's Minimax Desktop Agent also fell into this category.
The innovation of OpenClaw lies in addressing a key pain point: connecting the desktop agent with chat tools. It connects different channels through mechanisms such as channel gateways. Users can quickly establish communication channels with just out-of-the-box configuration and drive the agent to perform tasks through chat tools.
Chu Qiushi: I first noticed OpenClaw during the Spring Festival. At that time, I saw a colleague posting about a Mac mini on WeChat Moments, calling it "raising a lobster." He continuously talked to the lobster through an IM chat tool and asked it to report the task progress every ten minutes. My first impression was that this is a tool for ToC users, which opens the permission of computer use, allowing users to drive the computer to perform various operations through chat tools.
What I first focused on was how it can keep executing tasks over long periods. The key lies in its ability to continuously form long-term memory: by constantly having the AI record and iterate on notes, it avoids exhausting the context window. As long as the machine stays on, tasks keep executing. This engineering design is worth learning from; in essence, it is an all-in-one product that combines prior experience with new model capabilities.
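The note-taking pattern described above can be sketched roughly as follows. This is a minimal illustration only: the notes file name and the character budget are assumptions, not actual OpenClaw internals.

```python
from pathlib import Path

NOTES = Path("memory.md")          # hypothetical notes file, echoing OpenClaw's memory.md
MAX_CONTEXT_CHARS = 20_000         # illustrative budget, not a real OpenClaw setting

def record_note(note: str) -> None:
    """Append a finding so it survives a context reset."""
    with NOTES.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def build_prompt(task: str, transcript: list[str]) -> str:
    """Rebuild the working context: durable notes plus recent transcript."""
    notes = NOTES.read_text(encoding="utf-8") if NOTES.exists() else ""
    context = "\n".join(transcript)
    if len(context) > MAX_CONTEXT_CHARS:
        # Drop the old transcript; the notes file carries the long-term memory.
        context = "\n".join(transcript[-5:])
    return f"# Task\n{task}\n\n# Notes so far\n{notes}\n\n# Recent context\n{context}"
```

The point of the design is that the transcript is disposable while the notes file is not, so a restarted or truncated session can resume from the notes alone.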
After the Spring Festival, our company also started to try similar practices. Due to compliance considerations, we prefer to deploy through cloud desktops. We found that directly letting AI modify code is often inaccurate and may directly submit it to the branch, causing problems for code merging and review. So we changed the approach: instead of letting AI directly modify the code, we let it generate a modification plan at the design document level. This document details every step of code modification but does not actually execute it. At the same time, it generates a visual HTML report, organizes all the code snippets that need to be modified, and sends them to me via email. After I open the report, I can check the snippets that I want to adopt one by one.
We were surprised to find that about 60% of the code snippets can be used directly. After checking, copy them to the AI tool or CLI tool in the IDE and let the model continue to generate the complete code based on these relatively reliable snippets. The accuracy is very high.
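A minimal sketch of this "plan, don't commit" workflow might look like the following. The plan format, file paths, and snippets here are all hypothetical, invented for illustration:

```python
import html

# Hypothetical plan format: the agent proposes snippets instead of editing files.
plan = [
    {"file": "service/order.py", "reason": "add idempotency check",
     "snippet": "if order.id in seen_ids:\n    return"},
    {"file": "api/routes.py", "reason": "validate payload",
     "snippet": "payload = OrderSchema().load(request.json)"},
]

def render_report(plan: list[dict]) -> str:
    """Render the proposed changes as a reviewable HTML report."""
    rows = []
    for item in plan:
        rows.append(
            f"<h3>{html.escape(item['file'])}</h3>"
            f"<p>{html.escape(item['reason'])}</p>"
            f"<pre>{html.escape(item['snippet'])}</pre>"
        )
    return "<html><body><h1>Proposed changes</h1>" + "".join(rows) + "</body></html>"
```

The reviewer then cherry-picks snippets from the report and feeds the accepted ones back into an IDE or CLI coding tool, which matches the workflow described above.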
We think this may become a new development model: at the beginning of a version, hand over the entire version requirements to the agent and let it generate a draft first. If the project already has relatively complete specifications and documents, after inputting the business requirements, the agent can generate a design plan containing a large number of code snippets, of which 70%–80% can be used directly. Developers only need to screen and adjust to quickly complete the development, which is equivalent to transforming AI programming into more refined human-machine collaboration.
There is also a typical scenario: I need to spend about an hour every day checking the status of various CI/CD pipelines and project progress on the R&D platform. In the future, perhaps we can let the agent automatically organize this information before the workday starts, generate reports, or flag key points of concern. There are actually many such application scenarios in R&D management.
Deng Lishan: The current situation is a bit like when ChatGPT first appeared. At that time, there were many discussions about it replacing various occupations, but after actual use, we found that there were still many problems and thresholds.
Chu Qiushi: Its Token consumption is actually quite large. There is also a typical pitfall case: I once asked the AI, via Chrome Use, to open the browser and analyze which interfaces a certain page loaded. But because the instruction was unclear, the AI understood it as needing to study what these APIs do and immediately called them. Some of them were deletion interfaces, and in the end all my comments on the platform were deleted.
Deng Lishan: There are similar cases on the Internet. A user asked an agent to handle work for them, and as a result, it deleted all the data on the computer.
Chu Qiushi: So many people specifically find an old computer or buy a Mac mini to run these agents.
Jiang Tianyi: In our practice, stability management of OpenClaw is very important. Its configuration file is not stable: once the agent restarts, the JSON configuration may be automatically modified or even corrupted. So I built a system that regularly probes the lobster and automatically backs up the configuration file. The stability of browser access also needs improvement. To address Token consumption, I introduced a new memory system, referring to ByteDance's OpenViking solution, and managed memory through files, which significantly reduced Token usage. Originally there was only a memory.md and a context stuffed with information; after the change, the effect improved noticeably.
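A probe-and-backup routine of the kind described could be sketched like this. The file names are assumptions; OpenClaw's actual config layout may differ:

```python
import json
import shutil
from pathlib import Path

CONFIG = Path("openclaw.json")        # hypothetical config path
BACKUP = Path("openclaw.json.bak")

def probe_and_backup() -> bool:
    """Return True if the config is healthy; restore the backup if it is corrupted."""
    try:
        json.loads(CONFIG.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        if BACKUP.exists():
            shutil.copy(BACKUP, CONFIG)   # roll back to the last known-good copy
        return False
    shutil.copy(CONFIG, BACKUP)           # config parses: refresh the backup
    return True
```

Run on a schedule (cron or a simple loop), this turns "the JSON was silently damaged after a restart" into an automatic rollback instead of a manual debugging session.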
Therefore, OpenClaw is not a "low-threshold" product as described by many official accounts. To use it effectively, one needs to be familiar with JSON configuration, have troubleshooting capabilities, and continuously debug and optimize skills. There is still a considerable threshold for ordinary users.
Deng Lishan: The first time I learned about OpenClaw was from an article that said 1.5 million agents flooded into a community initiated by agents themselves, and humans could only watch without participating. This prompted me to further study its technology stack and principle. It not only plans the process and executes tasks by itself but also can reflect and iterate. For example, when it lacks a certain skill, it will actively collect information online, write new specifications and skills for itself, and finally complete the task. This is also very inspiring for programming work. After the code is completed, we can use this reflection mode to let AI continuously check the code quality and test cases and make continuous corrections.
Chu Qiushi: There is still a problem at present: how to transform these capabilities into shared memory at the team level. Many current usage methods still remain at the individual level, but in team development projects, the reasoning experience accumulated by team members during the use of agents should be shared. OpenClaw currently stores memory through MD files. How to connect these memories in an enterprise environment is a question worth thinking about.
Jiang Tianyi: Its core idea is Programmatic Tool Calling (PTC): describing the entire work process in code. When it encounters problems it cannot solve, it generates Python scripts by itself and runs them in a sandbox, which handles many problems that are difficult to solve through MCP or traditional tool calling. Registering and developing MCP tools has a certain threshold of its own, whereas implementing PTC through skills lets many things be completed automatically; even the MCP-call code itself can be generated and executed by large models. The key to OpenClaw, therefore, is a good architectural concept plus a powerful coding model like Claude.
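The PTC idea of running model-generated scripts outside the main loop can be sketched as follows. This is only an illustration, not OpenClaw's actual sandbox; a real sandbox would also restrict filesystem and network access, while this sketch merely isolates the script into its own interpreter process:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_generated_script(script: str, timeout: int = 30) -> str:
    """Execute model-generated Python in a separate process and capture its output."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "task.py"
        path.write_text(script, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(path)], capture_output=True, text=True, timeout=timeout
        )
        if result.returncode != 0:
            # Feed the traceback back to the model so it can repair the script.
            raise RuntimeError(result.stderr)
        return result.stdout

# Instead of chaining many individual tool calls, a model might emit one script:
generated = "print(sum(range(10)))"
```

The advantage over call-by-call tool use is that loops, branching, and intermediate data stay inside the script, so only the final result consumes context.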
Chu Qiushi: Does it mean that coding agents, office agents, etc. can all be its sub-layers and become professional agents in specific fields?
Jiang Tianyi: Its architecture works like this: there is a single core intelligent agent named Pi, which is lighter than Claude Code's coding agent and retains only capabilities such as memory retrieval and tool calling. On top of it sits a gateway that receives requests from different channels and forwards them to Pi. The specific capabilities all live in skills. This architecture is highly extensible while the core stays lightweight.
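The described gateway-plus-single-core layout can be sketched roughly as follows. Class names and skills are invented for illustration, and real routing would be done by the model rather than by keyword matching:

```python
from typing import Callable

class PiAgent:
    """Minimal core agent: only skill registration and dispatch (hypothetical sketch)."""
    def __init__(self) -> None:
        self.skills: dict[str, Callable[[str], str]] = {}

    def register_skill(self, name: str, fn: Callable[[str], str]) -> None:
        self.skills[name] = fn

    def handle(self, message: str) -> str:
        # Real routing would use the model; here we match on a keyword for illustration.
        for name, fn in self.skills.items():
            if name in message:
                return fn(message)
        return "no matching skill"

class Gateway:
    """Receives requests from any channel and forwards them to the single Pi core."""
    def __init__(self, agent: PiAgent) -> None:
        self.agent = agent

    def receive(self, channel: str, message: str) -> str:
        return self.agent.handle(message)

pi = PiAgent()
pi.register_skill("screenshot", lambda m: "screenshot taken")
gw = Gateway(pi)
```

The point of the shape is that adding a new chat channel touches only the gateway, and adding a new capability touches only the skill registry; the core agent never grows.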
Chu Qiushi: Do we still need Claude Code or Gemini CLI? Should the lobster call these tools, or can the lobster plus a good model replace these coding agents?
Jiang Tianyi: Both methods are feasible. For example, we have a lobster specifically responsible for plugin development, and it registers Claude Code as a skill to Pi. We found that if Pi is connected to a strong model, it can not only generate Python scripts but also has stronger task decomposition and understanding capabilities. In comparison, some domestic models are a bit weaker. Therefore, ideally, we need both an excellent coding model and a model with strong planning ability. When the planning ability is insufficient, we can use a model with stronger planning ability (such as Minimax or Kimi K2.5) as a fallback, and then connect to Claude Code during the coding stage.
Chu Qiushi: In the past, we used agent frameworks like LangChain or CrewAI to build multi-agent systems or workflows. Will these also become skills and be integrated into OpenClaw in the future?
Deng Lishan: I think it is possible. Skills are dynamically loaded: as long as they are clearly described in MD files, they are automatically retrieved and loaded when needed. This is also how OpenClaw currently operates: when it needs a skill it lacks, it searches the skill marketplace, installs it, and then executes the corresponding task.
The biggest problem with AI Coding is not generation but controllability
Deng Lishan: In 2026, what is the real biggest problem with AI Coding?
Jiang Tianyi: The core problem lies in the instability and uncontrollability of AI-generated code, reflected in several aspects. First, hallucination: AI's understanding of requirements is prone to deviation. Natural language is inherently ambiguous, which is true for humans and even more so for AI. Second, the generated technology stack is inconsistent with the team's existing one. Many models are trained mainly on mainstream foreign stacks (such as Next.js and Tailwind CSS) and have weak support for enterprise-internal private frameworks or customized libraries; even with forceful guidance, it is hard to fully fit the team's technology system. Third, the maintainability of AI-generated code is poor. After adopting AI coding at scale, many teams gradually stop reading the code AI writes carefully; as long as the function meets expectations, they let it pass. But I think result-driven development (RDD) is more important: we must verify cases at the result level through thorough test cases to truly judge the reliability of AI-generated code.
Take a typical example: we once tried to develop an online screenshot tool. The conventional approach would be to launch a headless browser and take the screenshot, but the AI simply used a curl request to call a third-party screenshot API. The function worked, yet it was not the solution we wanted at all: a typical case of uncontrollable technical solutions in AI coding. This is where the value of the SPEC-driven method lies: in the requirement-understanding stage, structure the requirements through SPEC; then convert them into technical designs through design and architecture so architects can review the rationality of the technology stack, solutions, and interfaces; then enter the plan stage for step-by-step execution. This can, to a certain extent, keep AI coding within a controllable framework.
Chu Qiushi: AI can output code extremely fast, but software development is essentially a process of moving from ambiguity to clarity, which requires continuous confirmation with business stakeholders and continuous iteration. What troubles me most at present is that it is difficult for AI to follow a clear set of rules at the business-function level. In terms of technical specifications, we can combine CI/CD with tools such as ArchUnit and PMD to turn specifications that used to live in Markdown documents into executable rules. Once code violates these rules, the source of the problem can be pinpointed, and AI's self-healing is reasonably effective at fixing common problems or code defects. In this way, the project's development architecture changes from an intangible specification into a tool-supported, detectable, constraint-enforcing system.
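ArchUnit and PMD are Java-ecosystem tools; as a language-neutral illustration of the same idea, here is a sketch in Python that turns a written layering rule into a detectable check. The rule itself ("controller code must not import the db module directly") is hypothetical:

```python
import ast

# Hypothetical layering rule, lifted out of a Markdown spec into data:
FORBIDDEN = {"controllers": {"db"}}

def check_imports(module_source: str, layer: str) -> list[str]:
    """Report imports that violate the layering rule, ArchUnit-style."""
    violations = []
    banned = FORBIDDEN.get(layer, set())
    for node in ast.walk(ast.parse(module_source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in banned:
                    violations.append(f"line {node.lineno}: imports {alias.name}")
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in banned:
                violations.append(f"line {node.lineno}: imports from {node.module}")
    return violations
```

Because the check names the exact line and module, a violation report like this can be handed straight back to the AI as a self-healing prompt, which is the loop described above.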
But at the business-function level, things are much more complicated. Even with Given-When-Then acceptance criteria, letting AI check its own work is not necessarily reliable; it often believes it has implemented everything as described. Developers therefore still need to run integration tests, which remains a relatively difficult part.
A key question is: how do we transform "what counts as a correct implementation of the requirements" into a form AI can verify? If a multi-agent mode is adopted, with agents responsible for development, design, and inspection finding problems by checking one another, then how requirements are expressed becomes crucial: how should requirements be written so that the inspecting AI can find problems more easily? The current dilemma is that when a single AI self-inspects against the specifications in the prompt, it often confidently concludes it has not violated any rules; but when a human points out a specific problem, it admits the violation. This shows it is hard to form a real closed loop in the current setup.
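One way to make a Given-When-Then criterion verifiable is to express it as an executable assertion rather than prose, so the checking side cannot talk itself past the rule. A minimal sketch with an invented coupon example (both the function and the numbers are hypothetical):

```python
def apply_coupon(total: float, percent: float) -> float:
    """Stand-in implementation under test; in practice this would be AI-generated."""
    return round(total * (1 - percent / 100), 2)

def test_coupon_reduces_total():
    # Given: a cart totaling 200.0
    total = 200.0
    # When: a 10% coupon is applied
    discounted = apply_coupon(total, 10)
    # Then: the total drops by exactly 10%
    assert discounted == 180.0
```

The Given/When/Then comments map one-to-one onto concrete values and an assertion, so "did the AI meet the requirement" becomes a test run rather than a self-assessment.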
Deng Lishan: Quality is the biggest challenge we face at present. We mainly impose constraints from two directions: first, improving AI's understanding of requirements; second, standardizing the code-generation process. On the first point, we must ensure that AI correctly understands the requirements: AI is essentially probabilistic inference, and the more accurate and structured the input corpus, the more reliable the inference.
In the code-generation stage, we formulate specifications based on the team's R&D experience, requiring AI to first understand the business logic in the document and then code according to the specifications. Quality inspection covers multiple dimensions: most importantly, ensuring the code implementation is consistent with the business logic; beyond that, maintainability, code-design quality, and whether there are hidden performance or security problems. These