With the explosion of Agent applications, what will power their continued rise?
After more than two years of development, AI has begun to accelerate its entry into the Agent era.
As AI evolves from "passive response" to "active decision-making," AI Agents are becoming the core hub connecting the digital and physical worlds.
From enterprise Agents that automatically handle customer service tickets, to academic Agents that coordinate multi-step scientific research experiments, and then to personal Agents that manage smart home ecosystems, these intelligent entities with the abilities of reasoning, planning, memory, and tool use are reshaping industrial forms.
However, what supports their intelligence is a complex and sophisticated infrastructure: it includes not only algorithms and models but also a full-lifecycle support system spanning R&D, deployment, collaboration, and operations.
In 2025, the AI Agentic infrastructure (Agent Infra) reached an inflection point of explosive growth. Breakthroughs in open-source large models such as DeepSeek and Qwen have provided a powerful cognitive "brain" for Agents, while the prosperous ecosystem of the Model Context Protocol (MCP) has endowed them with flexible "limbs."
According to IDC's prediction, 80% of global enterprises will deploy Agents within the year. The co-evolution of the "brain" and "limbs" is forcing the "trunk" that carries them to undergo a comprehensive upgrade, and Agent Infra has become the core battlefield for technological breakthroughs.
Enterprise Applications of Agents Face Five Major Pain Points
Products that use AI capabilities to automate work processes have existed for a long time. Before the emergence of generative AI, RPA-like products were very popular.
However, limited by the relatively weak AI capabilities of the time, RPA could only automate simple, single-track workflows; it lacked real intelligence and could not solve complex, compound problems.
It was not until the emergence of generative AI and the appearance of various truly intelligent Agent applications that people began to achieve significant efficiency improvements from AI automation.
In essence, an Agent is an AI that can call tools. Manus, for example, uses prompts to steer an AI model and orchestrates an elaborate workflow so that the model can use various tools to complete a complex task.
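The "model plus tools" pattern described above can be sketched as a simple loop: a decision step picks a tool, the tool runs, and the result feeds back into the next decision until the task is done. The tool names and the `decide_next_step` function below are hypothetical stand-ins; in a real Agent, `decide_next_step` would be an LLM call:

```python
# Minimal sketch of an Agent loop: a decision function repeatedly picks
# which tool to call until it judges the task complete.
# All names here are illustrative, not any vendor's actual API.

def search_web(query: str) -> str:
    # Stand-in for a real search tool.
    return f"results for: {query}"

def write_file(text: str) -> str:
    # Stand-in for a real file-writing tool.
    return f"saved {len(text)} chars"

TOOLS = {"search_web": search_web, "write_file": write_file}

def decide_next_step(task: str, history: list) -> dict:
    # In a real Agent this is an LLM call; here, a fixed two-step plan.
    if not history:
        return {"tool": "search_web", "arg": task}
    if len(history) == 1:
        return {"tool": "write_file", "arg": history[-1]}
    return {"tool": None, "arg": None}  # task considered done

def run_agent(task: str) -> list:
    history = []
    while True:
        step = decide_next_step(task, history)
        if step["tool"] is None:
            return history
        result = TOOLS[step["tool"]](step["arg"])
        history.append(result)

print(run_agent("summarize Agent Infra trends"))
```

The loop structure, not the toy tools, is the point: every modern Agent framework is some elaboration of this decide-call-observe cycle.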
However, whether it is research-oriented Agent applications represented by DeepResearch or general-purpose Agent applications like Manus, they are all provided to end-users through web pages or apps.
This delivery model is not suitable for professional AI developers, AI entrepreneurs, or enterprise users. What they need is for Agents to use proprietary data and integrate seamlessly into business operations, continuously creating value for the business.
When using Agents commercially, the first problem encountered is terminal performance: when a powerful Agent runs on a user's local device, a range of problems arise.
The most fundamental is the computing-power limitation for AI inference. An Agent consists of a powerful AI model plus a set of toolchains for it to call.
Running a powerful AI model usually requires specialized AI computing power from GPUs or dedicated AI chips, and almost no consumer-grade PC or phone can host a full-precision large model locally. As a result, most Agent companies currently rely on cloud computing power, performing both model training and inference in the cloud.
A related issue is the computing power for task execution. Agent tasks are characterized by high concurrency and heavy compute demands. Once an enterprise deploys an Agent locally and the business it supports starts growing rapidly, more computing power is needed immediately, and local deployment cannot scale fast enough. Conversely, when business is idle, the surplus capacity sits unused, wasting the enterprise's resources.
For example, in the early days, Manus used virtual machines on local servers to perform tasks, which led to performance issues and unstable services when a large number of users flocked in, affecting its early reputation to some extent.
The second pain point is that configuring AI tools is cumbersome. If an Agent cannot call tools, it can hardly solve complex problems.
For example, to build a sales Agent, it needs to call the CRM to obtain customer information, call the internal knowledge base to automatically introduce products to customers, and also call various communication tools to directly reach customers.
There are already many intelligent computing centers across the country, which can initially alleviate the problem of computing power limitation. However, these intelligent computing centers only provide computing power and do not provide the various toolchains needed to build Agents.
If an enterprise wants to customize an Agent closely coupled with its business, it needs to build its own toolchain. This is a very complex project. On the one hand, it requires high development costs. On the other hand, it takes a considerable amount of development time before the Agent is officially deployed, which will actually slow down the enterprise's business development speed.
After solving the problems of computing power limitation and AI tool configuration, professional AI developers and enterprise users will immediately encounter the third problem, which is permission conflicts.
The purpose of developing and deploying an Agent is to integrate it into one's own business. In this process, in addition to calling various tools, it also needs to closely cooperate with various software in the business.
Taking a sales Agent as an example: when it calls the CRM, the internal knowledge base, and external communication tools, it not only occupies local computing resources but, more problematically, competes with human employees for access and operation permissions.
When Agents contend with human employees for resources and permissions instead of cooperating with them, they can actually reduce the overall efficiency of the team.
For enterprise users there is a fourth major problem: weak security. Enterprises adopt Agents to strengthen their business or improve employee efficiency, which inevitably requires access to internal company data.
However, the execution of Agent tasks is a black box, and the execution process is opaque to users. It may modify or delete files in the local computer's file system. At best, it may leave junk files and cause system bloating; at worst, it may cause file loss or data leakage.
Moreover, Agents themselves pose security risks when calling tools.
Statistics show that more than 43% of MCP service nodes have unauthenticated Shell call paths, more than 83% of deployments have MCP configuration vulnerabilities, and 88% of AI component deployments do not enable any form of protection mechanism at all.
As Agents become ever more widely used, security and trust will be even more crucial in the AI era than they were in the Internet era.
After actually running locally deployed Agents, enterprises also face a fifth problem: Agents lack long-term memory.
Without semantic and episodic memory, an Agent can only complete one-off tasks, which severely limits how widely it can be used in enterprise business.
When enterprise users apply Agents to their business, if they can endow Agents with long-term memory, then in addition to being able to complete multiple tasks, enterprises can also iterate Agents based on these memories, enabling them to have a deeper understanding of the business or users and become more capable in specific tasks.
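The long-term memory described above reduces, at its simplest, to storing past interactions and retrieving the relevant ones before the Agent acts. The sketch below uses naive keyword overlap as a stand-in for the embedding-based vector search a production system would use; the class and example records are invented for illustration:

```python
# Toy long-term memory store: save past interactions, then retrieve the
# most relevant ones by keyword overlap. A production Agent would use
# embeddings and a vector database instead; this is only a sketch.

class MemoryStore:
    def __init__(self):
        self.records = []  # list of (text, token-set) pairs

    def add(self, text: str):
        self.records.append((text, set(text.lower().split())))

    def recall(self, query: str, top_k: int = 2) -> list:
        q = set(query.lower().split())
        # Score each record by how many query words it shares.
        scored = [(len(q & toks), text) for text, toks in self.records]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:top_k] if score > 0]

memory = MemoryStore()
memory.add("customer Acme prefers quarterly billing")
memory.add("customer Acme uses the enterprise CRM plan")
memory.add("office coffee machine was repaired")

# Before answering, the Agent recalls relevant context:
print(memory.recall("what billing does customer Acme prefer"))
```

The same store also supports the iteration the text mentions: accumulated records can be mined offline to fine-tune or re-prompt the Agent for its specific business.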
Agent Infra Emerges to Meet the Moment
Nowadays, cloud providers are competing to launch a new generation of Agent Infra technology architectures.
For example, AWS launched AgentCore (in preview), a fully managed runtime deeply customized and optimized on top of the Lambda FaaS infrastructure. It addresses key limitations of standard Lambda for Bedrock Agents, such as long-running execution, state recording, and session isolation.
Azure launched the AI Foundry Agent Service, which integrates Functions FaaS event-driven capabilities, enabling the Agent Service to leverage the event-driven, scalability, and flexibility of Serverless computing to more easily build and deploy Agents.
Google Cloud launched Vertex AI Agent Builder. Although Google has not stated it officially, the service is generally inferred to rely heavily on an optimized Cloud Run (Cloud Functions 2nd gen is already built on Cloud Run) to support long-running, concurrent, and stateful workloads.
Alibaba Cloud launched Function AI on its Function Compute (FC) platform. Alibaba Cloud has stated explicitly that it is deeply optimized on FC's Serverless x AI runtime and offers model services, tool services, and Agent services; developers can freely combine models, runtimes, and tools to build and deploy Agents in a modular fashion.
PPIO launched the first domestic Agentic AI infrastructure service platform, which comes in a general version and an enterprise version.
The general version is supported by a distributed GPU cloud base, and it released China's first Agent sandbox compatible with the E2B interface, as well as a model service more suitable for Agent construction.
The Agent sandbox is a cloud-based secure runtime designed specifically for Agent task execution. It supports dynamic invocation of tools such as Browser Use, Computer Use, MCP, RAG, and Search, giving Agents safe, reliable, efficient, and agile "hands and feet." The sandbox has already been integrated with well-known open-source projects such as Camel AI, OpenManus, and Dify.
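The core contract of such a sandbox is: execute untrusted, Agent-generated code in an isolated environment with a hard timeout, and return its output. The sketch below shows only the shape of that contract using a bare subprocess; it is deliberately simplified and offers no real security, whereas an actual E2B-compatible sandbox isolates at the VM or container level:

```python
import subprocess
import sys

# Illustrative sandbox contract: run a code snippet in a separate
# process, enforce a timeout, capture stdout/stderr. A bare subprocess
# is NOT real isolation; production sandboxes use microVMs/containers.

def run_in_sandbox(code: str, timeout_s: float = 5.0) -> dict:
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode, "timed_out": False}
    except subprocess.TimeoutExpired:
        # Runaway Agent code is killed instead of hanging the host.
        return {"stdout": "", "stderr": "", "exit_code": None,
                "timed_out": True}

result = run_in_sandbox("print(2 + 2)")
print(result["stdout"].strip())  # prints 4
```

The timeout path matters as much as the happy path: an Agent that generates an infinite loop must be contained, which is exactly why this execution layer lives outside the Agent itself.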
These technologies all point to the same goal - to provide Agents with a "trunk" with higher elasticity, lower latency, stronger security, and longer sessions, supporting their transition from the laboratory to millions of enterprise scenarios.
When cognition and action form a closed loop, the technological gap in Agent Infra will determine the speed and quality of enterprise AI innovation and transformation.
The evolution of the Agent development paradigm has put forward new requirements for the underlying infrastructure.
The new generation of Agent Infra from major cloud providers focuses on technological breakthroughs such as long-term operation, session affinity, session isolation, enterprise-level IAM and VPC, and model/framework openness, which in essence aim to meet the common needs of three core Agent forms.
First is the strong demand of LLM Agents for continuous tool calling. LLM Agents need to continuously call toolchains to complete complex reasoning, which may span several minutes or even hours.
The execution time limit of traditional Serverless (such as AWS Lambda's 15-minute cap) will forcibly interrupt such tasks, so the new generation of Agent Infra must break this limit and support long-running execution.
At the same time, to keep context consistent across multi-round conversations, session affinity is required, ensuring that requests from the same session are routed to the same computing instance and no state is lost.
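Session affinity is commonly implemented by deterministically mapping a session ID to an instance, so every request in the same conversation lands on the same machine. A minimal sketch of the idea (the instance names are invented, and real routers add health checks and consistent hashing with virtual nodes to survive instance failures):

```python
import hashlib

# Route every request of a session to the same instance by hashing the
# session ID. Deterministic hashing means no routing table is needed;
# the same ID always yields the same instance.

INSTANCES = ["instance-a", "instance-b", "instance-c"]

def route(session_id: str) -> str:
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return INSTANCES[int(digest, 16) % len(INSTANCES)]

# The same session always maps to the same instance:
print(route("user-42-chat"), route("user-42-chat"))
```

The trade-off is that naive modulo hashing reshuffles most sessions when the instance list changes, which is why production systems prefer consistent hashing.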
Secondly, Workflow Agents rely on state management. Automated workflows (such as data processing Pipelines) need to persistently record execution states.
The stateless nature of traditional Serverless cannot preserve intermediate results, whereas the new generation of Agent Infra guarantees the atomicity and recoverability of workflows through stateful sessions. Session isolation keeps tasks from interfering with one another in multi-tenant or high-concurrency scenarios, meeting enterprise-level security and compliance requirements.
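The state persistence described above usually boils down to checkpointing: record each completed step's output so that a crashed run resumes where it stopped instead of restarting. A stdlib-only sketch of the idea, assuming a toy extract-transform-load pipeline; a real platform would persist checkpoints to durable replicated storage, not a local file:

```python
import json
import os

# Checkpointed pipeline: each step's result is written to disk, so a
# re-run skips steps that already completed. Illustrative only.

CHECKPOINT = "pipeline_state.json"

def load_state() -> dict:
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {}

def save_state(state: dict):
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

def run_pipeline(steps: dict) -> dict:
    state = load_state()
    for name, fn in steps.items():
        if name in state:
            continue  # already done in a previous run; skip it
        state[name] = fn(state)
        save_state(state)  # checkpoint after every step
    return state

steps = {
    "extract": lambda s: [3, 1, 2],
    "transform": lambda s: sorted(s["extract"]),
    "load": lambda s: f"loaded {len(s['transform'])} rows",
}
print(run_pipeline(steps))
os.remove(CHECKPOINT)  # clean up the demo checkpoint
```

Because each step is checkpointed atomically, re-running after a crash re-executes only unfinished steps, which is the recoverability property the text attributes to stateful sessions.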
Third is the flexibility and ecological integration of Custom Agents. Custom Agents need to integrate heterogeneous tools (APIs, domain models, databases, Code Interpreter, Browser Use, etc.), which requires the new generation of Agent Infra to support model/framework openness (such as AutoGen, LangChain, AgentScope).
A closed architecture limits how far an Agent's abilities can be extended, so cloud providers decouple the computing layer from the framework layer and expose plug-in integration interfaces.
The new generation of Agent Infra retains the core advantages of Serverless (fully managed, operations-free, lightweight, elastic, and economical) while addressing the core needs of all three forms: continuous reasoning for LLM Agents, complex state transitions for Workflow Agents, and flexible customization for Custom Agents. It does so through key functions (long-running execution, session affinity and session isolation) and technical breakthroughs (state persistence, cold-start optimization, and open integration).
This marks that Agent development is shifting from "manually piecing together traditional components" to a new technological path of "using native Infra to achieve efficient, secure, and scalable development and deployment."
As the application of Agents accelerates further, Agent Infra has become an area actively explored by model companies, cloud providers, and startups. In addition to cloud giants, startups also have significant opportunities in this field.
First, find links in existing Infra that have AI-native requirements. Such a requirement may be that Agent development places higher demands on a link's performance: for example, a Sandbox needs faster cold starts and stronger isolation. It may also be the need for better integration with AI workflows and more AI-native features, such as built-in RAG or tighter integration with the languages and SDKs AI developers commonly use.
Second, seize new pain points in Agent development. Agent developers want a high return on R&D and time investment, so demand is strong for Infra products that lower the development threshold and workload; an easy-to-use, reasonably priced Infra therefore has the opportunity to be widely adopted. Moreover, the Agent ecosystem emphasizes co-construction, and continuous Infra innovation is strongly driving that ecosystem's growth.
When developing an Agent becomes as convenient as assembling Lego bricks and when the Agent collaboration network penetrates every corner of society, we will no longer debate "whether this is a trend or a bubble" because this is a new future that is approaching.
This article is from the WeChat official account "Technology Cloud Report" (ID: ITCloud-BD), author: Technology Cloud Report. Published by 36Kr with authorization.