
The cloud that has given birth to numerous unicorns is now putting Agents into practice.

36Kr Brand · 2025-12-03 22:12
From AI infrastructure and inference platforms to data layers and development tools, Amazon Web Services has reinvented the "all-in-one package" for Agent development.

It is an undeniable fact that all enterprises are embracing AI.

When the animated film Demon Slayer: Kimetsu no Yaiba grossed nearly $800 million at the global box office, Sony, parent company of the production studio Aniplex, used large models to make the project's compliance review and evaluation process 100 times more efficient.

Adobe, which holds nearly 30% of the creative-design software market, has seen 29 billion creative assets generated with its AI design tool, Adobe Firefly, since its launch this year.

All the content you see on the online community Reddit is now reviewed and filtered by Reddit's in-house community-management AI.

Behind the scenes, however, out of most people's sight, the construction and operation of these agents rely on ever more precise, large-scale, and complex service systems.

The models of Sony and Reddit were trained on Amazon Web Services' model customization platform, Amazon Nova Forge. The data used to train Adobe Firefly is stored in Amazon Web Services' storage services, Amazon S3 and Amazon FSx.

These stories of embracing AI happen every day on Amazon Web Services' cloud.

In the past year, Amazon Web Services' generative AI development platform, Amazon Bedrock, has served over 100,000 customers globally. More than 50 of these enterprises process over 1 trillion tokens per day. Just four months after its release, the Agent development tool, Amazon AgentCore SDK, has been downloaded over 200 million times. Notably, Matt Garman revealed at the 2025 Amazon Web Services re:Invent conference that the number of unicorn startups born on Amazon Web Services far outpaces others.

Customers' choices are also reflected in rapidly growing revenue. Over the past year, Amazon Web Services' revenue reached $132 billion, a 20% year-on-year increase. The absolute increase, $22 billion, exceeds the annual revenue of more than half of the Fortune 500 companies.

For a cloud enterprise that has been around for nearly 20 years, continuously innovating its services is no easy feat.

What is the source of innovation? The answer given at the 2025 Amazon Web Services re:Invent conference is: staying in sync with cutting-edge technologies and customer needs.

This year, Amazon Web Services identified a new direction: Agents.

At the 2025 Amazon Web Services re:Invent, Matt Garman, the CEO of Amazon Web Services, predicted that Agents will be the inflection point at which AI's value is unlocked.

In the process of delving into customer needs, he observed that Agents are accelerating R&D in the healthcare sector, improving customer service, and enhancing bill-processing efficiency. "In the future, there will be billions of Agents within every company and in every imaginable field. This transformation will have as profound an impact on businesses as the Internet and cloud services."

The new opportunities Agents bring have generated a flood of new demands. On December 1st, the first day of the 2025 Amazon Web Services re:Invent, Amazon Web Services used a crane to hoist an old, decommissioned server and blew it up.

This ceremony, which celebrates customers' departure from traditional platforms and the modernization of their systems, also holds special significance for Amazon Web Services today. At the 2025 Amazon Web Services re:Invent, Amazon Web Services made as many as 12 new AI announcements.

In the AI era, the providers and users of AI are achieving a win-win.

Amazon Web Services' 66 "Why Not"s

After the Amazon Web Services team launched the system modernization service, Transform, in 2024, the sense of accomplishment from solving pain points didn't last long. The team soon sat down to ask: What is the next problem to solve?

The continuous change in demand tracks the rapid transformation of the AI industry. Over the past year, the model layer has quickly shifted from pre-training to post-training paradigms such as reinforcement learning. At the application layer, Agentic AI has quickly become the consensus, and models, infrastructure, and application development are all realigning around it.

"It seems like there's something new every day," Matt said. However, on this thriving track, he witnessed the "other side" of the current state of AI implementation: "When I talk to customers, many haven't seen returns that match the promises of AI. The true value of AI has yet to be unlocked."

Even though Agents are regarded by Matt as a milestone for the true release of AI value, it's undeniable that their implementation is still in its early stages.

In The State of AI in 2025: Agents, Innovation, and Transformation, a report released by McKinsey in November 2025, 32% of enterprises were still in the pilot phase of AI adoption, and only 7% had truly achieved large-scale implementation of AI.

However, as Agents enter the opening phase of the implementation boom, a series of new pain points and demands follow.

The productivity breakthrough of Agents lies in their autonomy in planning and execution. But once that line is crossed, autonomy turns into loss of control.

Matt compared deploying Agents to raising a teenager: as Agents grow in capability, they need more freedom and autonomy to think and learn independently; at the same time, you need to set ground rules that govern their access to tools and data.
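To make those ground rules concrete, here is a minimal, hypothetical sketch of such a guardrail in plain Python. The Guardrail class, tool names, and data scopes are all illustrative inventions for this article, not an Amazon Bedrock AgentCore API: the agent plans freely, but every tool call is checked against an allow-list first.

```python
# Hypothetical guardrail sketch: autonomy inside a boundary, refusal outside it.
# Names here (Guardrail, search_tickets, support_db) are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Guardrail:
    allowed_tools: set[str] = field(default_factory=set)
    allowed_data_scopes: set[str] = field(default_factory=set)

    def check(self, tool: str, data_scope: str) -> bool:
        """Permit a call only if both the tool and the data scope are allowed."""
        return tool in self.allowed_tools and data_scope in self.allowed_data_scopes

def run_tool(guardrail: Guardrail, tool: str, data_scope: str) -> str:
    """Gate every tool call the agent proposes before it executes."""
    if not guardrail.check(tool, data_scope):
        return f"DENIED: {tool} on {data_scope}"
    return f"OK: executed {tool} on {data_scope}"

policy = Guardrail(allowed_tools={"search_tickets"},
                   allowed_data_scopes={"support_db"})
print(run_tool(policy, "search_tickets", "support_db"))  # within the rules -> OK
print(run_tool(policy, "delete_records", "support_db"))  # outside the rules -> DENIED
```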

The deeper you get into the industry's upstream and downstream, the more real problems you'll see.

During his speech at the 2025 Amazon Web Services re:Invent, Matt mentioned "Why Not" at least 66 times. Behind each "Why Not" lies a new industry demand and a new challenge for Amazon Web Services to solve.

For example, developers spend so much time purchasing servers and managing infrastructure that development itself gets squeezed out. Why not reduce that management time and cost to zero?

Every enterprise has a large corpus of intellectual property and business data. However, solutions using external RAG or vector databases don't allow models to truly understand enterprise data. Why not develop a custom model training tool?

In actual implementation scenarios, it's difficult to measure whether an Agent has made the right decision, chosen the best tools, produced correct outputs, or stayed on brand. Why not create a tool to evaluate and supervise Agents in real time?

These "Why Not"s drive Amazon Web Services to push the limits of infrastructure and invent new development modules for generative AI systems and applications. All of this is related to Matt's vision: "Everyone has the freedom to keep inventing."

A Chain for Building Agents Emerges from the "Why Not"s

Making Agentic AI a reality isn't just about the "last mile" of service delivery.

It's a precision chain spanning AI infrastructure, model training, data integration, and development tools. Any break in this chain can flaw the AI implementation and trigger a crisis of enterprise trust in AI.

Of course, this doesn't mean humans must be deeply involved in every link. AI's capacity for self-learning and flexible generalization lets it serve as the builder, debugger, operator, and manager of AI applications, freeing humans from trivial processes to explore more meaningful questions, such as how to further unlock the value of AI.

In Matt's view, to build an Agent that can truly bring value to enterprises, four aspects must be considered: AI infrastructure, inference platforms, data, and Agent development tools.

Let's start with infrastructure.

GPUs and servers are the farthest from application implementation in the chain but are of crucial importance. A scalable and powerful AI infrastructure layer can not only reduce the cost of model training, customization, and inference but also provide a safe and stable environment for model training and operation.

As the scale, speed, and intelligence of generative AI deployments increase, the efficiency, speed, quality, and cost-effectiveness of infrastructure matter even more. The self-developed AI chip Amazon Trainium2, launched at last year's re:Invent, impressed the industry with its price-performance for large-scale AI training and inference and quickly grew into a multi-billion-dollar business.

This year, Amazon Web Services' newly launched Amazon Trainium3 UltraServers have once again won the price-performance competition: compared with their predecessor, they deliver 4.4 times the compute, 3.9 times the memory bandwidth, and 5 times the tokens processed per megawatt of power.

In actual operation, when OpenAI's open-source inference model gpt-oss-120b runs on Amazon Trainium2 UltraServers and Amazon Trainium3 UltraServers at the same interaction latency, the latter outputs 5 times more tokens per megawatt. The same pattern holds for other open-source models.
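As a back-of-the-envelope illustration of what "tokens per megawatt" measures, the sketch below normalizes throughput by power draw. The absolute throughput and power figures are made-up placeholders, not published benchmarks; only the 5x ratio comes from the article.

```python
# Illustrative tokens-per-megawatt arithmetic. The throughput and power
# values are invented placeholders; only the 5x ratio is from the article.

def tokens_per_megawatt(tokens_per_sec: float, power_mw: float) -> float:
    """Throughput normalized by power: tokens per second per megawatt."""
    return tokens_per_sec / power_mw

trn2 = tokens_per_megawatt(tokens_per_sec=1_000_000, power_mw=1.0)  # assumed
trn3 = tokens_per_megawatt(tokens_per_sec=5_000_000, power_mw=1.0)  # assumed

print(f"Trainium3 vs Trainium2: {trn3 / trn2:.1f}x tokens per megawatt")  # 5.0x
```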

Of course, optimization never ends. Matt revealed that design of the next-generation chip, Amazon Trainium4, has already begun. Compared with Trainium3, Trainium4 will offer 6 times the FP4 compute performance, 4 times the memory bandwidth, and twice the high-bandwidth memory (HBM) capacity. "All of this is to support the world's largest models," Matt said.

Next, let's look at the inference layer.

In the current context of AI implementation, models remain crucial. As the "brains" of AI applications, the selection and combination of models determine the upper limit of application effectiveness and the lower limit of cost. The competitiveness of MaaS providers essentially lies in having a wider variety of high-performance models on their shelves.

Compared with last year, the number of models on Amazon Web Services' generative AI development platform, Amazon Bedrock, has nearly doubled. The roster of Chinese model providers has grown from two, Qwen and DeepSeek, to four: models from the Chinese large-model startups Moonshot AI and MiniMax have also landed on Amazon Bedrock.

Another model-layer trend closely tied to the rise of Agents is open source. Open-sourcing a model effectively opens the data, compute, and intellectual resources that went into its training to all developers equally, letting them explore new AI applications at lower cost.

This year, Amazon Web Services launched a new self-developed open-source model series, Amazon Nova 2, and made it available on Amazon Bedrock. Its three models, Nova 2 Lite, Nova 2 Pro, and Nova 2 Sonic, are suited respectively to cost-sensitive scenarios, complex task processing, and real-time humanlike speech-to-speech dialogue.
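For developers, models on Amazon Bedrock are typically reached through a single runtime API regardless of which model is chosen. Below is a minimal sketch using boto3's Converse API; the Nova 2 model identifier is a placeholder guess, so check the Bedrock model catalog for the real ID and make sure model access is enabled on the account.

```python
# Minimal Bedrock Converse API sketch. The client and converse() call are
# real boto3 APIs; the Nova 2 model ID below is a hypothetical placeholder.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-2-lite-v1:0",  # placeholder ID; verify in the console
    messages=[{"role": "user", "content": [{"text": "Summarize our Q3 results."}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},
)

# The Converse API returns the assistant message under output.message.content.
print(response["output"]["message"]["content"][0]["text"])
```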

Another self-developed model, Amazon Nova 2 Omni, is the industry's first unified multimodal inference and generation model. It accepts input in four modalities, text, image, video, and voice, and can generate text and images. Processing mixed modalities thus no longer relies on inefficient conversion between modalities but shifts to a more humanlike, efficient parallel understanding.

Now, let's turn to the data layer.

In the AI era, an enterprise's moat is undoubtedly the business knowledge and data it has accumulated over time. The prerequisite for building that moat, however, is to make full use of the data and build an AI model that truly understands the business.

By Matt's observation, enterprises traditionally customize models in one of two ways: plugging enterprise data into an external RAG pipeline or vector database, or fine-tuning or running reinforcement learning on an existing third-party model.

The limitations of both are obvious. A model backed by external RAG or a vector database never internalizes deep domain knowledge or the professional judgment that business decisions require. And fine-tuning or reinforcement learning on a third-party model is hard to do well because the enterprise has no access to its pre-training data.
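To see why the first approach keeps knowledge outside the model, here is a toy sketch of the external-RAG pattern in Python. The bag-of-words "embeddings" and the two sample documents are stand-ins for a real embedding model and corpus: the retrieved passage rides along in the prompt, and the model's weights never absorb it.

```python
# Toy external-RAG sketch: retrieve the closest passage, paste it into the
# prompt. The model never trains on this data, so nothing is internalized.
import re
import numpy as np

DOCS = [
    "Refund requests over $500 require director approval.",
    "New vendors must pass a compliance review before onboarding.",
]

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

VOCAB = sorted({w for d in DOCS for w in tokenize(d)})

def embed(text: str) -> np.ndarray:
    """Toy bag-of-words vector over the corpus vocabulary."""
    words = tokenize(text)
    return np.array([words.count(w) for w in VOCAB], dtype=float)

def retrieve(query: str) -> str:
    """Return the stored passage with the highest cosine similarity."""
    q = embed(query)
    sims = [
        float(q @ embed(d)) / (np.linalg.norm(q) * np.linalg.norm(embed(d)) + 1e-9)
        for d in DOCS
    ]
    return DOCS[int(np.argmax(sims))]

query = "How do refund requests get approved?"
# The enterprise knowledge rides along in the prompt, outside the model:
prompt = f"Context: {retrieve(query)}\n\nQuestion: {query}"
print(prompt)
```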

For most companies, pre-training a model from scratch on their own data is prohibitively expensive. So Amazon Web Services launched the custom model training platform, Amazon Nova Forge.

"Open - ended training" is a new model customization paradigm pioneered by this platform. By providing exclusive