
At the "American AI Gala", a bucket of cold water was poured on Agents.

阿菜cabbage 2025-12-11 13:33
AI infrastructure is not yet ready to accommodate the explosion of Agents.

Text by | Zhou Xinyu

Edited by | Su Jianxun

In December 2025, if you board a plane bound for the United States, you'll most likely encounter two groups of people:

One group consists of PhDs who talk about deep learning and attention mechanisms, senior executives from large companies, and investors. Their destination is San Diego, the host city of NeurIPS (the Conference on Neural Information Processing Systems), the "Oscars" of the AI research community, where they're betting on the most valuable AI research and talent of the future.

The other group is focused on the most practical AI deployment at present. AI entrepreneurs carrying business cards, along with their partners and clients, flock to Las Vegas to look for the surest AI opportunities at re:Invent, Amazon Web Services' most important annual conference.

As for what those sure bets are, after talking with more than a dozen US-based developers at re:Invent, we found a consensus on both sides of the ocean: the era of Agents has arrived.

Matt Garman, the CEO of Amazon Web Services, announced 12 new AI-related releases, all centered around the infrastructure, development, and management of Agents. In his speech, he made a judgment: The emergence of AI Agents is truly unleashing the value of AI.

An engineer at Amazon Web Services in the US has felt the change keenly. At last December's re:Invent, the slogans all over the venue were about AI Cloud and Model as a Service, and fewer than five vendors, DataDog among them, mentioned Agents.

But this year is different. "If you toast every exhibitor who claims to be working on Agents, even if they're not actually doing it," he joked, "you'll be drunk halfway through."

△ Inside the venue


However, in contrast to the intoxicating "Agent fever", there is a sense of calm among most US developers.

"I'm here to pour cold water. Whether it's from the perspective of cost or AI-first capabilities, I think the current infrastructure for Agents is still very weak," Huang Dongxu, the co-founder and CTO of the database service provider PingCAP, who has been based in Silicon Valley for many years, told Intelligent Emergence.

The aforementioned Amazon Web Services engineer shares a similar view. "The development speed of Agents is disruptive," he said. Once vendors see that Agents can be millions of times more efficient at development than humans, their demand for Agent development grows exponentially.

This poses huge challenges for training and inference compute, as well as for the software and hardware used to store data (the raw material for training Agents). "The industry has gradually shifted from a shortage of GPUs to a shortage of memory," he told Intelligent Emergence.

Under the heavy pressure of inference costs, "Develop for Cost" (building with cost reduction as a goal) has become a new yardstick of competitiveness among US Agent startups.

Limited by models' reasoning ability, Agents often need to call "Pro"-tier models to complete complex tasks or process long texts, which in turn drives up calling costs.

Zhu Zheqing, the founder and CEO of Pokee.AI, once said publicly that, on average, 80-90% of the costs of AI Agents on the market go to inference. He added that AI application companies can only turn a real profit if they cut inference costs by 80%.

"A common question that VCs ask Agent startups now is: What's the inference cost? Can the subscription fees cover the inference cost?" a US Agent entrepreneur told us at re:Invent.
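The arithmetic behind that VC question can be sketched in a few lines. The example below is only an illustration: the subscription price, token volume, per-token price, and non-inference cost are all made-up assumptions, not figures from any company quoted here. They are chosen so that inference lands in the 80-90% share of costs Zhu Zheqing describes.

```python
from dataclasses import dataclass


@dataclass
class AgentPlan:
    """Per-user monthly economics of an Agent product (all values hypothetical)."""
    subscription_usd: float        # monthly subscription fee per user
    tokens_per_user: int           # tokens consumed per user per month
    usd_per_million_tokens: float  # blended model price per million tokens
    other_cost_usd: float          # non-inference cost per user per month


def inference_cost(plan: AgentPlan) -> float:
    """Monthly inference spend per user, in USD."""
    return plan.tokens_per_user / 1_000_000 * plan.usd_per_million_tokens


def inference_share(plan: AgentPlan) -> float:
    """Fraction of total per-user cost that goes to inference."""
    inf = inference_cost(plan)
    return inf / (inf + plan.other_cost_usd)


def monthly_margin(plan: AgentPlan, inference_reduction: float = 0.0) -> float:
    """Subscription revenue minus cost, after cutting inference cost by a fraction."""
    cost = inference_cost(plan) * (1 - inference_reduction) + plan.other_cost_usd
    return plan.subscription_usd - cost


# Hypothetical plan: $20/month, 4M tokens per user, $10 per million tokens.
plan = AgentPlan(subscription_usd=20.0, tokens_per_user=4_000_000,
                 usd_per_million_tokens=10.0, other_cost_usd=5.0)

print(f"inference share of cost: {inference_share(plan):.0%}")
print(f"margin today:            {monthly_margin(plan):+.2f} USD")
print(f"margin after 80% cut:    {monthly_margin(plan, 0.8):+.2f} USD")
```

With these assumed numbers, the subscription does not cover inference: each user loses roughly $25 a month. Cutting inference cost by 80% turns that into a gain of roughly $7, which is the kind of swing the "reduce inference cost by 80%" remark points to.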

Another fundamental question from developers about Agents is: Are the software products in the market ready to be called by Agents?

"The current software ecosystem is developed for humans, not for AI Agents," a Code product manager from Anthropic told us.

As a veteran in the database industry, Huang Dongxu shares the same view. "Humans and AI have different preferences for software usage."

For example, AI dislikes data silos, which take more tokens to bridge. And when faced with too many calling interfaces, AI is prone to "out-of-control" behavior such as hallucinations and degraded intelligence caused by scattered attention.

Huang Dongxu believes that vendors need to launch a software revolution based on the concept of "for Agent use":

First, on the software interaction interface, Agents should be able to express flexible requirements in the simplest way, such as designing a database interaction language similar to SQL for AI; second, avoid creating data silos; third, control costs.

However, "pouring cold water on Agents" also means that there is still much room for iteration and optimization, as well as business opportunities, for the model layer, Infra layer, and data layer vendors that serve as Agent infrastructure.

A new wave of AI infrastructure investment is sweeping through Silicon Valley. Optimizing AI Infra to reduce the inference cost during model calls is becoming a new trend in Infra startups.

For example, in September 2025, NVIDIA was reported to have spent over $900 million to license the technology of AI Infra startup Enfabrica and to hire its CEO. A database exhibitor told Intelligent Emergence that he is currently planning to invest in several AI Infra projects led by Chinese-American entrepreneurs in the US.

Another important aspect of Agent infrastructure, data, is also attracting attention. At the re:Invent exhibition, database vendors such as Snowflake, MongoDB, and Databricks occupied half of the exhibition space. An employee from Snowflake told us that data determines an Agent's understanding of business and scenarios.

Therefore, database vendors face two new tasks: first, to find a database form that can interact more effectively with Agents, and second, to expand capacity so they are ready for the explosive demand for Agent development.

"Agents are not a bubble at present," an employee from Anthropic summarized. "But if everyone chases the most obvious applications and no one builds the matching infrastructure, it will become a bubble."

△ The Anthropic booth was crowded with attendees listening to a talk. Photo: taken by the author
