Global Agents Compete on "Loop Engineering": AI Works, Supervises, and Revises Itself

Kill the game.

Yesterday, Xiaolei came across a post by Andrew Ng on X about Loop engineering for Agents.

Image source: X

If you've used Claude Code, Codex, Workbuddy, Kimi Work, or other Agent products in the past six months, you may have already felt the change. Instead of teaching the AI what to do step by step, you now just hand over the requirements, and it writes, runs, and corrects errors on its own until the task is complete.

This "self-running" feeling is Loop engineering as implemented in Agent products.

"... You shouldn't write prompts for the Coding Agent anymore; you should design the Loop." In June, a tweet by Peter Steinberger, the founder of OpenClaw, sparked a lot of discussion. Not long before that, Addy Osmani, an engineering director at Google, had systematically laid out the concept of the Loop and proposed the term Loop engineering.

Image source: X

Judging from public information, however, the person who first proposed the Loop and made it prominent is Boris Cherny, the creator of Claude Code. At the end of June, Anthropic also published a blog post that made public all four types of loop primitives in Claude Code.

In just one month, Loop engineering has gone from a new term to an industry consensus and a mainstream focus. So what exactly is a Loop, and what does it mean for ordinary users?

From prompts to Harness and Loop, what have Agents gained?

In the past, using AI meant writing prompts. If you asked the model to write code, modify documents, or conduct research, the more detailed the prompt, the better the result. As we move from models to Agents, the model must not only answer but also know when to read files, run commands, search the web, and ask humans for help. Prompt engineering and chain-of-thought alone are no longer enough.

This is where Harness engineering came in. It can be understood as the operating framework around the model, responsible for connecting tools, managing permissions, injecting context, and storing state. The model is still responsible for reasoning and generation, but it is now placed in an environment where it can execute tasks.

Loop engineering takes it a step further. It focuses on how to make the Agent "loop" continuously around a goal.

OpenClaw's official documentation regards Loop as the "foundation". Image source: OpenClaw

Simply put, the user sets a goal. The Agent first understands the task, then fetches the context, calls tools, observes the results, and judges whether the task is completed. If not, it continues to modify, run, and check. This process is similar to the daily work of humans: make a version, find problems, and then revise it until the result can be delivered.

So, the focus of Loop is not on the word "loop" itself, but on what is actually included in the loop.

Claude Code is the most typical example. It doesn't simply connect Claude to the terminal. Instead, it lets the model repeatedly call tools, edit files, run commands, and observe the returned results in a while loop. The truly complex parts sit outside the loop: the permission system, context compression, plugins, skills, hooks, sub-Agents, and session storage.
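The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Claude Code's actual implementation: `fake_model`, `TOOLS`, and `run_loop` are hypothetical names, the tools return canned strings, and the "model" is a scripted stand-in for a real model call.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"(contents of {path})",
    "run_tests": lambda _arg: "2 passed, 0 failed",
}


def fake_model(goal: str, history: list[str]) -> dict:
    """Scripted stand-in for a model call: picks the next action from the history."""
    if not history:
        return {"tool": "read_file", "arg": "login.py"}
    if len(history) == 1:
        return {"tool": "run_tests", "arg": ""}
    return {"done": True, "summary": f"{goal}: {history[-1]}"}


def run_loop(goal: str, max_steps: int = 10) -> str:
    """Call the model, run the tool it asks for, feed the result back, repeat."""
    history: list[str] = []
    for _ in range(max_steps):
        action = fake_model(goal, history)
        if action.get("done"):
            return action["summary"]
        observation = TOOLS[action["tool"]](action["arg"])
        history.append(observation)  # the observation becomes context for the next step
    return "stopped: step budget exhausted"


print(run_loop("fix login page"))  # fix login page: 2 passed, 0 failed
```

In a real product, the parts the article places outside the loop (permissions, context compression, hooks, and so on) would wrap around the single tool call and the `history` list shown here.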

Whether an Agent can run on its own doesn't depend on the model's momentary impulse to think a few more steps. Instead, it relies on a whole set of engineering designs to support it.

This is also the core background to Andrew Ng's discussion of Loop engineering. By 2026, Agent products like Claude Code, Codex, ZCode, and MiniMax Code have already made "write, run, check results, revise" a default capability.

Anthropic's blog post also divides Loops into four types: turn-based, goal-based, time-based, and proactive.

- Turn-based: it replies to each message you send.

- Goal-based: you say "Help me write a login page," and it writes, tests, and revises.

- Time-based: you can have it automatically check a certain PR every two hours and review it if there are updates.

- Proactive: this is the most aggressive type. It can discover problems on its own and take action. For example, if it finds that test coverage has dropped, it adds test cases on its own.
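One way to read the four types is by what triggers another pass of the loop. The sketch below encodes only the one-line descriptions above. `LoopKind`, `should_run`, and all parameter names are hypothetical, and the definitions in Anthropic's post may be more nuanced.

```python
from enum import Enum


class LoopKind(Enum):
    TURN = "turn-based"
    GOAL = "goal-based"
    TIME = "time-based"
    PROACTIVE = "proactive"


def should_run(
    kind: LoopKind,
    *,
    new_message: bool = False,
    goal_met: bool = False,
    seconds_since_last_run: float = 0.0,
    interval_seconds: float = 2 * 60 * 60,  # "every two hours", as in the PR example
    problem_detected: bool = False,
) -> bool:
    """Return True if a loop of this kind should start (or continue) another pass."""
    if kind is LoopKind.TURN:
        return new_message  # one pass per user message
    if kind is LoopKind.GOAL:
        return not goal_met  # keep going until the goal is reached
    if kind is LoopKind.TIME:
        return seconds_since_last_run >= interval_seconds
    return problem_detected  # LoopKind.PROACTIVE: act when the Agent itself spots an issue


print(should_run(LoopKind.TIME, seconds_since_last_run=3 * 60 * 60))  # True
print(should_run(LoopKind.GOAL, goal_met=True))  # False
```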

Goal-based Loop. Image source: Anthropic

The surge in discussion in May and June is also tied to product progress. OpenAI's Codex is no longer just an entry point for "helping you write code". It can read repositories, modify files, run tests in an independent environment, and then submit the logs and results.

On Anthropic's side, Claude Code itself has almost become the best example of Loop engineering. Boris Cherny's words "Stop writing prompts yourself and let an Agent prompt Claude" may sound a bit confusing, but the meaning is that humans no longer need to be responsible for how to ask the model at each step. Instead, humans are responsible for designing the mechanism that allows the model to work continuously.

This is also what ordinary users really need to care about in Loop engineering. The better the Loop engineering, the more an Agent behaves like a person who can take on tasks. You give it a direction, and it runs forward on its own. If it goes off track, it can correct itself based on feedback. After it finishes, it hands the process and results back to you for inspection.

What's the use of "killing" prompts for ordinary people?

The most direct value of Agent Loop for ordinary users is to lower the threshold of prompt design.

In the past, using AI was like collaborating with a smart but inexperienced intern. You had to tell it how to do each step, when to stop, where to find information, and what it must not fabricate. The more detailed your instructions, the better it performed; the vaguer your instructions, the more likely it was to go off track.

An Agent with a well-designed Loop is more like a person who already knows the basic workflow. You don't need to remind it every time that "if the code reports an error, keep fixing it", because testing and rework are already part of the loop. You don't even need to put all the context into the dialog box at once, because the Agent can gradually gather the information it needs through the file system, search tools, memory, and indexing.

This will change the relationship between users and AI.

In the past, when writing prompts, users often had to play product manager, project manager, test engineer, and teacher all at once. You had to provide requirements, break down steps, monitor progress, and correct errors. In the future, the user's job will look more like setting goals and accepting results.

Take asking an Agent to make a travel plan. In the past, you might have had to spell out the budget, the number of days, the order of work (check flights first, then hotels), the need to account for transportation, the output format (a table), and a final summary. With a well-designed Loop, you just say "Go to Tokyo for 5 days next month with a medium budget, want to avoid hassle and see more exhibitions", and the Agent should be able to check dates, compare prices, arrange the route, find conflicts, and provide a plan on its own. It can also rearrange the plan automatically after feedback such as "The second day is too busy".

This is the first sense in which prompts are being "killed". Ordinary users no longer need to train themselves to be prompt engineers. Agent products should absorb the complex process on the user's behalf.

Software engineering, meanwhile, is a natural fit for Loops. Goals can be written as issues, the process can be broken down into file modifications, tools can run tests, and results can be verified with diffs and CI. If an Agent makes a mistake, the system sees the error report immediately; once it is fixed, the system sees the tests pass. This feedback loop is clear, verifiable, and cumulative, which is why Claude Code, Codex, ZCode, and MiniMax Code all entered the coding scenario first.
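The verifiable feedback loop described above can be made concrete with a toy example. In this sketch, `check` stands in for a CI run and `CANDIDATES` stands in for successive revisions an Agent might produce. Both names, the `add` task, and the test cases are assumptions made purely for illustration.

```python
from typing import Callable, Optional


def check(func: Callable[[int, int], int]) -> list[str]:
    """Stand-in for CI: return failure messages (an empty list means all tests pass)."""
    failures = []
    for args, expected in [((2, 3), 5), ((-1, 1), 0)]:
        got = func(*args)
        if got != expected:
            failures.append(f"add{args} returned {got}, expected {expected}")
    return failures


# Hypothetical successive revisions: the first is buggy, the second is the fix.
CANDIDATES = [lambda a, b: a - b, lambda a, b: a + b]


def revise_until_green(candidates, max_rounds: int = 5) -> tuple[Optional[int], list]:
    """Try each revision in order, logging the failures, until the checks pass."""
    log = []
    for round_no, candidate in enumerate(candidates[:max_rounds], start=1):
        failures = check(candidate)
        log.append((round_no, failures))  # the record a human can inspect afterwards
        if not failures:
            return round_no, log
    return None, log


passed_round, log = revise_until_green(CANDIDATES)
print(passed_round)  # 2
print(log[0][1][0])  # add(2, 3) returned -1, expected 5
```

The point of the sketch is that success is decided by the checks rather than by the Agent's own judgment, and that every failed round leaves a record behind.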

Image source: Zhipu Zcode

But code is just the beginning. Research, spreadsheets, PPTs, data analysis, customer service tickets, legal retrieval, recruitment screening, and operational monitoring all have similar characteristics: the task cannot be completed in one sentence, but the success criteria can be written down, the process can be recorded, and the results can be checked.

This is the second value of the Loop: improving the productivity of complex work. Humans no longer need to monitor every step. Instead, they are responsible for setting the direction, checking the results, and revising the specifications. The developer feedback loop mentioned by Andrew Ng means that AI can accelerate the inner execution loop, but humans still need to judge whether the direction is correct in a higher-level loop.

In addition, a poorly built Agent tends to leave a first impression of being unstable, using tools at random, and drifting further and further off track. From an engineering perspective, the Loop provides a way to build in reliability.

Under the Loop engineering design, why an Agent searches a certain page, modifies a certain file, calls a certain tool, and judges that the task is completed can all be recorded. Fixes can become skills, and project rules can be written into files like AGENTS.md, CLAUDE.md, or similar memory files. The next time the Agent performs a similar task, it doesn't need to start from scratch again.

However, it should be made clear here that Loop doesn't automatically bring reliability. In fact, a poorly designed Loop will only allow errors to replicate themselves more quickly.
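One common mitigation, shown here as a sketch rather than something the article prescribes, is to give the loop a step budget and to stop when the same error keeps coming back. `guarded_loop`, `step`, and the thresholds are hypothetical names and values.

```python
from typing import Callable, Optional


def guarded_loop(
    step: Callable[[], Optional[str]],
    max_steps: int = 20,
    max_repeats: int = 3,
) -> tuple[str, int]:
    """Run step() until it returns None (success).

    Stops early when the same error appears max_repeats times in a row,
    or when the step budget runs out.
    """
    last_error: Optional[str] = None
    repeats = 0
    for i in range(1, max_steps + 1):
        error = step()
        if error is None:
            return ("done", i)
        repeats = repeats + 1 if error == last_error else 1
        last_error = error
        if repeats >= max_repeats:
            return ("escalate", i)  # hand the problem back to a human
    return ("budget_exhausted", max_steps)


# A step that fails the same way every time is stopped after three passes.
print(guarded_loop(lambda: "ImportError: no module named foo"))  # ('escalate', 3)
```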

Conclusion

In the past three years, the way we use AI has undergone several major changes, but the underlying logic remains the same: humans issue instructions, AI executes, and humans then judge the results. Humans have always been at the center of the loop and are the core driving force of the entire system. Loop engineering has, for the first time, moved humans from the center of the loop to the outside. Humans are no longer the drivers but the navigators.

The impact of this change will be more profound than expected. For developers, the core competitiveness lies in the ability to define problems and design acceptance criteria. For products, the iteration speed will further accelerate, forcing product teams to understand users and business better because technology is no longer the bottleneck; judgment is.

Of course, all of this rests on one premise: models still need to keep getting stronger. How many iterations a Loop can run and how complex a task it can handle ultimately depend on the model's basic capabilities. If the model goes off track after just a few steps, then no matter how refined the Loop design is, it won't work.

Fortunately, judging from this year so far, model progress has not slowed down. GPT-5.5, Claude 5, GLM-5.2, M3, K2.6, DeepSeek V4: every company has updated its models within just six months, and each generation has significantly improved in Agent capabilities.

The models are getting stronger, the Loops are running more smoothly, and humans are stepping back. This trend is already very clear.

This may sound like just an improvement in efficiency, but on closer thought, it may be a crucial step in AI's transformation from a "tool" to a "collaborator". A tool is something you use: you need to know how to operate it at each step. A collaborator is someone you give a goal to: it figures out the way on its own, and you work together to get the job done.

We may now be standing at this dividing point.

The views in this article are solely those of the author. The 36Kr platform only provides information storage services.
