Behind Qianwen Agent, Doubao Phone, and Skills, a blue ocean for "selling shovels" to Agents is taking shape.
Agents are regarded as the native application form in the AI era and are highly anticipated. Gold rushers in the AI era will also flock to this field.
Since the end of 2025, Agents capable of "getting things done" have attracted a great deal of attention. Examples include the Doubao phone, which was resold for up to 10,000 yuan on the second-hand market; the general-purpose agent startup Manus, which was acquired by Meta for billions of dollars; and Qianwen Agent, which announced integration with more than 400 products across Alibaba's ecosystem and can directly complete long-horizon tasks such as ordering takeaway and booking tickets.
These carefully crafted agent products have captured the public's attention and sparked discussion about AI's shift from dialogue to action. Away from the spotlight, the open-source community is also seeing a lively wave of "hand-crafted agents".
Tech geeks use flexible toolkits like Claude Skills to assemble "digital employees" that fit real-world business needs, or to create Agents designed specifically to optimize workflows. It is like raising a dedicated digital partner for the AI era, and the sense of control and achievement that comes from building one from scratch is a large part of the appeal.
A senior technology industry analyst summarized it this way: from the perspective of the Agent ecosystem, the market divides into three layers: the "base-level capabilities layer", the "AI-oriented business layer", and the "human-oriented business layer". This is an exciting qualitative change with great room for imagination.
The most noteworthy of the three is the "machine-oriented (AI-oriented) business layer", a brand-new market segment that did not exist in the PC Internet era or the mobile Internet era. In this layer, an infrastructure market that specifically serves Agents is rapidly taking shape. "Shovel-selling" players that provide search APIs and identity authentication for Agents have quietly moved in to claim this new blue ocean.
01 Starting from the Popular Claude Skills
In mid-January, with the arrival of Claude Cowork, Skills broke out of the developer circle. SKILL.md, once found only on the command line, has now reached the desktops of ordinary office workers.
When you hand it a stack of reimbursement receipts, it quietly activates the "Expense-Audit" skill, invokes OCR, verifies tax numbers, and generates reports. This plug-and-play experience has made Skills the most coveted "cheat code" for office workers in 2026.
Now that it has crossed over to the mainstream, it is worth noting that Skills actually became popular among developers as early as late October 2024. At that time, Anthropic released the Claude Code terminal tool and the striking Computer Use function, enabling AI to operate computers and write complex code.
However, new problems also emerged. Developers found that although Computer Use gave AI "hands and feet", its "brain" lacked professional SOPs (Standard Operating Procedures) for specific tasks. To get it to write a React component, you had to write a long prompt every time to teach it how.
Claude Skills came into being.
Its logic is extremely simple: encapsulate professional knowledge in a SKILL.md file so that instructions are loaded on demand. These rules normally consume no tokens. Only when Claude recognizes a matching task does it read the Skill, much like loading a driver.
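The on-demand loading idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's implementation: the `Skill` class, the word-overlap matcher `pick_skill`, and the sample file contents are all hypothetical. In Claude itself, the model decides which Skill applies; the sketch only shows that a short description stays available while the long body is read when needed.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str
    description: str  # short summary that is always available for matching
    path: Path        # location of the full SKILL.md body

    def load_body(self) -> str:
        # The long instructions are read from disk only when the skill is chosen.
        return self.path.read_text(encoding="utf-8")


def pick_skill(task: str, skills: list[Skill]) -> Skill | None:
    """Hypothetical matcher: pick the skill whose description shares the most words with the task."""
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for skill in skills:
        score = len(task_words & set(skill.description.lower().split()))
        if score > best_score:
            best, best_score = skill, score
    return best


# Demo: write a throwaway SKILL.md and route a task to it.
skill_file = Path(tempfile.mkdtemp()) / "SKILL.md"
skill_file.write_text(
    "# Expense audit\n1. OCR each receipt\n2. Verify tax numbers\n3. Generate a report\n",
    encoding="utf-8",
)
skills = [Skill("expense-audit", "audit expense receipts and reimbursement reports", skill_file)]

chosen = pick_skill("please audit these reimbursement receipts", skills)
instructions = chosen.load_body() if chosen else ""
```

A task that matches no description returns `None`, so no Skill body is read at all.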
The open-source community has given Claude Skills high praise, calling it the "professional wrapper of the low-code era". Because it is based on Markdown, even people who are not senior programmers can create a Skill as long as they understand the workflow.
Currently, the most-watched projects in the GitHub ecosystem are concentrated in two directions. The official anthropics/skills repository serves as the reference, with standard skills for high-frequency scenarios such as PDF conversion, in-depth code analysis, and Excel automation. It is the benchmark that developers learn from and refer to.
Among community open-source projects, obra/superpowers, maintained by Jesse Vincent, quickly became popular in early 2026 thanks to its distinctive automated execution framework and "self-reflection"-style instruction set. It has become the third-party skill toolkit with the fastest-growing star count and the one most favored by geeks in the community.
Jesse Vincent is a senior open-source software architect and one of the first to recognize the power of the SKILL.md file format. Through a series of blog posts (such as "Skills for Claude!"), he showed developers worldwide how to "inject" complex human professional experience into AI through a simple Markdown file.
The Superpowers 4 release he published at the end of 2025 introduced an independent code review agent mechanism, which directly fueled the wave of "AI agent autonomy" in early 2026. In the eyes of many geeks, Jesse's work has set a template for professional Standard Operating Procedures (SOPs) in the AI era.
In just over a year, skillsmp has come to list 60,000 Claude Skills.
Digital Life Kazik, a leading AI self-media account in China, commented on Skills as follows: "The value of Skills lies in reuse. Tomorrow you'll want to create a second one. The day after tomorrow, you'll want to transfer all your processes into it. By then, you'll enter a different state, a state of freedom and creation."
Claude Skills has opened up a new paradigm and shown that there are also huge market opportunities in the infrastructure layer of Agents. "One-person companies" or "super individuals" can independently create reusable skill packs for Agents. The field has great potential and may evolve into an ecosystem as diverse as the App Store of the mobile Internet era.
02 The Preliminary Formation of the Agent "Shovel" Ecosystem
The mainstream breakout of Skills in early 2026 reveals only the tip of the iceberg of the agent economy. The potential of Skills is huge, but the foundation of the ecosystem contains many other crucial components.
What is the complete framework of an Agent? Lilian Weng, a researcher at OpenAI, defined an Agent as an intelligent system with a large model as its "brain", which can plan and break down goals, accumulate experience through memory, and extend its reach with tools in order to execute complex tasks autonomously. This framework is widely accepted in the industry.
The "brain" layer is currently the territory of foundation model players. Future competition there is uncertain, however, because the industry faces a new question: is an end-to-end large model itself an agent? We analyze this point of controversy in the last part of the article.
Besides a brain, an Agent needs to learn how to plan, which is the art of breaking complex goals down into executable steps. In 2026, agents generally have the ability to reflect and self-examine. Like humans, they check their own outputs, and when they find that search results do not match, they automatically correct course through patterns like ReAct.
Orchestration frameworks like LangGraph or CrewAI are invisible to ordinary users, but as the built-in "meta-tools" of Agents they monitor and optimize the execution path so that tasks do not get stuck in logical dead ends.
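The check-and-correct loop described above can be sketched as follows. This is a toy illustration of a ReAct-style cycle (thought, action, observation, reflection), not LangGraph or CrewAI code. `fake_search` and `propose_query` are hypothetical stand-ins for a real search API and for the model's reasoning step.

```python
def fake_search(query: str) -> str:
    """Stand-in tool: a tiny lookup table instead of a real search API."""
    index = {"capital of france": "Paris"}
    return index.get(query.lower(), "")


def propose_query(question: str, attempt: int) -> str:
    """Stand-in for the model's 'thought' step: try the raw question first, then a rewrite."""
    rewrites = [question, "capital of france"]
    return rewrites[min(attempt, len(rewrites) - 1)]


def react_loop(question: str, max_steps: int = 3):
    trace = []
    for attempt in range(max_steps):
        query = propose_query(question, attempt)  # thought -> action
        observation = fake_search(query)          # observation
        trace.append((query, observation))
        if observation:                           # reflection: did the result match?
            return observation, trace
    return None, trace                            # step budget exhausted


answer, trace = react_loop("What is the capital of France?")
```

The `max_steps` budget is the design choice that keeps the loop from spinning forever in a dead end: the first query finds nothing, the second succeeds, and an unanswerable question stops after three attempts.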
The key to letting an Agent provide personalized service is "Memory". In addition to short-term memory, which records the current conversation, long-term memory built with RAG (Retrieval-Augmented Generation) has become a standard feature. Using vector database APIs like Pinecone or Milvus, Agents can retrieve information from large volumes of historical documents at any time.
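The retrieval half of such long-term memory can be sketched without any external service. This is a minimal, assumption-laden example: `embed` is a toy word-count "embedding" and `LongTermMemory` is a hypothetical in-memory class, whereas a production system would call an embedding model and a vector database such as those named above.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []

    def remember(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def recall(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = LongTermMemory()
memory.remember("user prefers window seats on morning flights")
memory.remember("quarterly report template lives in the finance folder")
memory.remember("user is allergic to peanuts")

hits = memory.recall("which seats does the user prefer on flights")
```

The recalled text would then be placed into the model's prompt, which is the "augmented generation" half of RAG.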
The most prominent trend in 2026 is the rise of "personalized profiles": Agents not only remember your work preferences but can also read your Google Drive or database records across platforms through MCP, forming a warm, personalized digital memory that is not forgotten after a single read.
What truly sets an Agent apart is its "hands and feet", that is, "Tools and Execution (Action)". This is currently the most prosperous layer, with a large ecosystem that ranges from Tavily's dedicated search API to Zapier's automation integration platform.
Claude Skills in particular encapsulates complex workflows into reusable capability packs. Combined with the MCP standard protocol, Agents can connect seamlessly to GitHub, Slack, and even local Docker environments, and can directly operate more than 8,000 SaaS applications.
In 2026, Anthropic filled in the interaction logic missing from the MCP protocol with Advanced Tool Use. Through tool search, usage examples, and programmatic calls, it addressed the perceptual overload and decision paralysis that Agents face when presented with a large number of tools.
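The idea behind tool search can be illustrated with a small sketch: instead of showing the model every tool definition, a search step narrows the catalog to the few tools relevant to the request. This is not Anthropic's API. The catalog entries, the `search_tools` function, and the word-overlap scoring are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Hypothetical tool catalog: name -> one-line description.
TOOL_CATALOG = {
    "github_create_issue": "open a new issue in a github repository",
    "slack_post_message": "post a message to a slack channel",
    "docker_run_container": "run a container from a local docker image",
    "calendar_create_event": "create an event on a calendar",
}


def search_tools(request: str, catalog: dict[str, str], limit: int = 2) -> list[str]:
    """Return up to `limit` tool names whose descriptions share words with the request."""
    words = set(request.lower().split())
    scored = []
    for name, description in catalog.items():
        score = len(words & set(description.split()))
        if score:
            scored.append((score, name))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


selected = search_tools("post the build status to the team slack channel", TOOL_CATALOG)
```

Only the selected definitions would be passed to the model, so the context stays small even when the catalog holds thousands of tools.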
Building a production-level Agent is now as standardized as assembling Lego bricks. Simply put, developers first define a clear "Persona" to set the behavior style; then build an "Environment" and equip it with external APIs through MCP connectors; next, inject "Knowledge" by storing business documents in a vector database; then design the "Orchestration" logic with an orchestration framework; and finally add a security filtering mechanism called "Guardrails" on the outermost layer to keep the Agent's behavior safe and controllable.
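The five steps can be lined up in one small sketch. Everything here is an assumption made for illustration: the `AgentSpec` class, its field names, the word-overlap knowledge lookup, and the `respond` tool are hypothetical, and a real build would use MCP connectors, a vector database, and an orchestration framework in their place.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentSpec:
    persona: str                                            # 1. Persona: behavior style
    tools: dict[str, Callable[[str], str]]                  # 2. Environment: external capabilities
    knowledge: list[str]                                    # 3. Knowledge: business documents
    blocked_terms: list[str] = field(default_factory=list)  # 5. Guardrails: simple deny list

    def retrieve(self, request: str) -> str:
        # Toy knowledge lookup: first document that shares a word with the request.
        words = set(request.lower().split())
        for doc in self.knowledge:
            if words & set(doc.lower().split()):
                return doc
        return ""

    def run(self, request: str) -> str:
        # Guardrails sit on the outermost layer, so they are checked first.
        if any(term in request.lower() for term in self.blocked_terms):
            return "Request refused by guardrails."
        # 4. Orchestration: a fixed two-step plan (retrieve context, then call a tool).
        context = self.retrieve(request)
        prompt = f"[{self.persona}] {request} | context: {context}"
        return self.tools["respond"](prompt)


agent = AgentSpec(
    persona="concise finance assistant",
    tools={"respond": lambda prompt: prompt},  # stand-in for a model or API call
    knowledge=["Travel expenses above 500 yuan need manager approval."],
    blocked_terms=["delete all"],
)
ok = agent.run("What is the approval rule for travel expenses?")
refused = agent.run("Delete all invoices")
```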
Note: The above hierarchical framework is compiled based on industry analysis and public information. There is currently no unified industrial standard.
The prototype of an intelligent agent ecosystem has begun to take shape. However, the industry is still in its early stage, and there are many unclear areas and problems to be solved.
03 Disagreements in the Intelligent Agent Ecosystem
The first disagreement lies in the essence of an Agent: is it helping people complete processes or becoming another kind of "person"?
One school of thought, represented by players like Dify, n8n, or LangChain, focuses on "Workflow Orchestration". In this paradigm, developers are like experienced watchmakers, carefully adjusting the nodes in LangGraph and deciding what to do in the first step and what in the second.
The other school holds that Agent creators should define the boundaries and behavioral rules of atomic operations, much like building a physical world, and then embrace the "magic" that may emerge from an Agent's uncertainty.
Manus's sandbox, with its ready-made toolbox, is one example: the Agent receives human instructions, but it discovers and combines atomic operations in the environment on its own.
This perspective holds that a real Agent should "improvise" in its environment like a human, rather than "looking for a sword where the boat was marked" along a preset process track.
This difference in perception extends directly to the underlying logic of the agent economy: what is the smallest settlement unit in this ecosystem? One school believes in breaking large goals down into countless small subtasks (Tasks), on the view that as long as the breakdown is fine-grained enough, the Agent can execute them accurately.
However, because an Agent's brain is a Transformer-based large model whose output is probabilistic, it is fundamentally impossible for an Agent to break tasks down with 100% accuracy.
Another trend is therefore turning toward an "Intent Protocol" centered on "generation". In this concept, intent is the smallest economic unit, and an Agent does not need to focus on complex intermediate steps; intent directly drives the release of capabilities.
Mingke, an entrepreneur in the Agent field, built the Agency Framework: "In my framework, an Agent is the interface for all capabilities, and other capabilities are exposed to end users through this interface, including access control, identity verification, and state management (such as whether the goal is achieved)."
On this path, future users may no longer be aware of specific apps or databases. They will interact with the entire digital world only through the "thin veil" of an Agent. Seen from a traditional perspective, this paradigm poses a fundamental challenge to the existing architecture of the Internet ecosystem.
This is why the Doubao phone, despite its great ambitions, runs into many obstacles in actual use, while Qianwen Agent can complete user-assigned tasks fairly thoroughly within Alibaba's own ecosystem.
In the traditional world, trying to establish new game rules is bound to be difficult.
In the eyes of many Agent entrepreneurs, to enable an Agent to smoothly fulfill an intent, a new way in a new world is needed.
In this concept, as mentioned above, the smallest unit of the Agent economy is "intent". Intent is defined as the "desired state".
After obtaining this state described by the user, the Agent converts the vague natural language into a series of executable plans.
For example, when a user says they want to go on a business trip to Beijing, it actually includes a series of intents such as booking tickets, reserving a hotel, and checking the weather. Ordering a cup of milk tea is also a kind of intent.
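The business trip example can be written down as a small data structure. This is a sketch under stated assumptions: the `Intent` class, the `DECOMPOSITION` table, and the `plan` function are hypothetical. A real Agent would have the model generate the sub-intents, and, as the article goes on to say, no common intent expression protocol exists yet.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    desired_state: str  # the state the user wants to reach, in natural language


# Hypothetical decomposition rules standing in for the model's planning step.
DECOMPOSITION = {
    "business trip": ["book tickets", "reserve a hotel", "check the weather"],
}


def plan(intent: Intent) -> list[Intent]:
    """Expand a vague intent into executable sub-intents; atomic intents pass through unchanged."""
    text = intent.desired_state.lower()
    for trigger, steps in DECOMPOSITION.items():
        if trigger in text:
            return [Intent(step) for step in steps]
    return [intent]


trip_plan = plan(Intent("Go on a business trip to Beijing"))
tea_plan = plan(Intent("Order a cup of milk tea"))
```

The trip expands into three sub-intents, while the milk tea order is already atomic and comes back as a single intent.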
Future business settlements will be based on paying for the result of "fulfilling the intent" for users.
However, there is a huge gap between "understanding the intent" and "delivering value": How can intent be standardized? What does it mean to truly fulfill an intent? Currently, Agents are still like isolated islands from each other, and there is no common intent expression protocol in the ecosystem.
This means that Agents cannot directly settle value through intent like the Visa or Swift systems handle currency.
It is obvious that Agents still have a long way to go before they can truly achieve value interconnection.
04 Where is the Boundary Between Models and Intelligent Agents?
Another core issue that remains controversial is this: if a foundation model with Agent capabilities can deliver Agent functions end to end, where is the boundary between it and an Agent? And if an end-to-end model can handle everything, can the agent ecosystem still develop healthily?
Mingke explained, "The essence of an LLM is only Next Token Generation. What happens to the generated token to turn it into an action that can affect the environment is beyond the scope of the model.
Next Token Generation is the only thing an LLM can do. Although they are collectively called models after productization, in fact, many things are added outside the model. For example, Claude has started to become an Agent, and there are many things stuffed between the core model, Claude Code, and Cowork, such as virtual machines, which do not belong to the model itself."
Therefore, if we set aside the technical implementation path, end-to-end models with Agent capabilities and Agents are effectively the same species, and both rely on the Agent ecosystem.
Currently, the gold rush in the Agent economy has begun. Compared with the chaotic competition in terminal applications, the "infrastructure layer" that provides standardized capabilities for Agents, that is, those indispensable "shovels", has evolved into a highly certain and explosive track.
But this time, the "shovel-sellers" are by no means playing a supporting role.
This article is from the WeChat public account "Tencent Technology". Author: Guo Xiaojing, Editor: Xu Qingyang. Republished by 36Kr with permission.