Cloud leader AWS's new battle: 25 products launched in 10 minutes, going all in on agents.
December 3, Las Vegas. In his annual re:Invent keynote, AWS CEO Matt Garman launched 25 new products in 10 minutes, with nearly 40 releases across the two-hour event. A series of heavyweight products—a new generation of in-house chips, frontier foundation models, and an enterprise-grade model customization framework—collectively demonstrated AWS's breakthroughs at the compute, model, and application layers.
As the world's leading cloud provider, with $132 billion in annualized revenue, AWS now stands at a critical crossroads: the AI boom has run for two years, yet many enterprises remain trapped in the anxiety of "high investment, low return" and are starting to ask when their massive spending will translate into real business results.
AWS's answer at this re:Invent is that AI must evolve from a chat-only assistant into an agent that actually performs tasks.
To seize the shift from model-driven to agent-driven AI, AWS has chosen a heavyweight path: going down the stack to build its own chips and double down on price-performance; deepening its models in the middle to break through the ceiling of fine-tuning; and setting guardrails at the top to address the core risk of agents running out of control in production.
Throughout the conference, AWS's strategic focus shifted from technical breakthroughs to enterprise-level value realization—pushing AI to create tangible value for enterprise customers.
From the dense slate of releases and the 20,000-word transcript of the keynote, we have distilled AWS's core progress along three dimensions: compute, models, and applications.
At the compute layer:
AWS's strategy has become both more pragmatic and more aggressive: on one hand, it uses in-house chips to drive down costs dramatically; on the other, it extends beyond the physical boundaries of the public cloud to accommodate large customers unwilling to migrate.
AWS released Trainium 3 UltraServers. Compared to the previous generation, Trn3 delivers a 5x improvement in inference energy efficiency. More aggressively, AWS made the rare move of pre-announcing Trainium 4, still in the design phase, promising a further 6x performance gain. The signal to the market is clear: AWS is determined to break free of absolute reliance on external compute for ultra-large-scale model training.
To address data-sovereignty concerns among financial and government customers, AWS launched AWS AI Factories—effectively building AWS's computing infrastructure directly inside the customer's own data center.
Of course, AWS remains NVIDIA's closest ally. Trn3 will support NVIDIA NVLink Fusion technology in the future, and the newly released P6e instances are the first to ship with NVIDIA's latest GB300 NVL72 system, built for the most extreme AI workloads.
At the model layer:
AWS finally rounded out its in-house Amazon Nova model family, launching the entire Amazon Nova 2 series at once. Among them, Nova 2 Omni is billed as the industry's first model supporting all four input modalities (text, image, audio, video) along with multimodal output; Nova 2 Pro excels at complex instruction following, and AWS claims it beats GPT-5.1 on benchmarks.
The biggest pain point for enterprises using large models is that fine-tuning is too shallow and easily degrades the model—for example, causing it to forget core capabilities. AWS's Amazon Nova Forge introduces the concept of "open training models," letting enterprises inject proprietary data into the final stage of model pre-training (sketched below).
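To illustrate why late-stage pre-training can go deeper than a shallow fine-tune, here is a hypothetical sketch of the underlying idea—blending proprietary data into the final pre-training mixture so domain knowledge is learned without crowding out general capabilities. All names are invented for illustration; this is not the Nova Forge API.

```python
# Hypothetical illustration of "open training": mixing proprietary data
# into the final stage of pre-training instead of fine-tuning afterwards.
# Everything here is invented for illustration; not the Nova Forge API.
import random

def final_stage_batches(public_corpus: list[str],
                        proprietary_corpus: list[str],
                        proprietary_fraction: float = 0.3,
                        num_batches: int = 1000,
                        batch_size: int = 32):
    """Yield mixed training batches: mostly public data, so the base
    model retains its general capabilities, plus a controlled share of
    proprietary data, so domain knowledge is baked in rather than
    patched on by a shallow fine-tune."""
    for _ in range(num_batches):
        yield [
            random.choice(proprietary_corpus)
            if random.random() < proprietary_fraction
            else random.choice(public_corpus)
            for _ in range(batch_size)
        ]
```

The controlled mixing ratio is the key design point: training on proprietary data alone is what tends to cause the "forgetting" problem described above.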
Sony Group announced it will be an early adopter of Nova Forge and AgentCore, aiming to make its compliance reviews 100x more efficient.
At the application layer:
Agents are the core of what comes next, but their unpredictability makes many enterprises hesitant to deploy them. AWS is attempting to turn agents into trusted productivity tools through a strict system of rules.
Matt Garman offered an analogy: managing AI agents is like raising a teenager—you give them freedom while setting firm boundaries. To that end, AWS launched AgentCore Policy. Unlike vague prompt-based guardrails, this is a deterministic control system built on the Cedar policy language that can intercept an agent's non-compliant operations in real time (for example, blocking an automatic refund over $1,000; a minimal sketch follows), addressing enterprises' fundamental fear of AI behavior spinning out of control.
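To make that concrete, here is a minimal sketch of such a deterministic guardrail. The policy uses real Cedar syntax, but the issueRefund action, the refund_amount context field, and the Python enforcement harness are hypothetical illustrations—not the actual AgentCore Policy API.

```python
# Hypothetical sketch of a deterministic agent guardrail in the spirit
# of AgentCore Policy. The Cedar policy below is syntactically valid
# Cedar, but the action/context names and this harness are invented.

# Cedar policy: forbid any refund action when the amount exceeds $1,000.
CEDAR_POLICY = """
forbid (
    principal,
    action == Action::"issueRefund",
    resource
) when {
    context.refund_amount > 1000
};
"""

def is_refund_allowed(refund_amount: float) -> bool:
    """Deterministic stand-in for evaluating the Cedar policy above.

    Unlike a prompt-based guardrail, this check returns the same answer
    for the same input every time—the agent cannot talk its way past it.
    """
    return refund_amount <= 1000

def handle_agent_action(action: str, refund_amount: float) -> str:
    # Intercept the agent's proposed action *before* it executes.
    if action == "issueRefund" and not is_refund_allowed(refund_amount):
        return f"BLOCKED: ${refund_amount:,.2f} refund exceeds the $1,000 policy limit"
    return f"ALLOWED: {action} for ${refund_amount:,.2f}"

if __name__ == "__main__":
    print(handle_agent_action("issueRefund", 250.00))   # allowed
    print(handle_agent_action("issueRefund", 1500.00))  # blocked
```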
For developers, AWS released Frontier Agents. These are not mere code assistants but independent digital employees: the Kiro Autonomous Agent can fix bugs on its own; the Security Agent automatically scans for vulnerabilities before code is committed; and the DevOps Agent can diagnose root causes and propose fixes when alerts fire in the middle of the night. The entire software engineering lifecycle is being taken over by AI.
The following is a transcript of the keynote at AWS's 14th re:Invent conference:
1. AWS Business Overview
Welcome to the 14th annual re:Invent conference. It's great to be here—over 60,000 people are with us in person, and nearly 2 million are watching online, including those tuning in via Fortnite for the first time. Welcome, everyone, and thank you for being here.
Walking through the corridors of Las Vegas, I feel an incredible energy—this aligns with what I've heard from you in recent months. It's been an amazing year: AWS has grown into a $132 billion annualized revenue enterprise with a 20% year-over-year growth rate. To put this in perspective: in the past year alone, our revenue increased by about $22 billion—this absolute growth exceeds the annual revenue of half the Fortune 500 companies.
This growth comes from all aspects of our business:
Amazon S3: Continues to grow—customers store over 500 trillion objects, hundreds of exabytes of data, and process over 200 million requests per second on average.
Computing Power: For the third consecutive year, over half of the new CPU capacity added by AWS comes from our in-house Graviton chips.
AI & Data: Millions of customers use our database services; Amazon Bedrock currently supports over 100,000 AI inference applications.
This year, we launched the first building blocks for safely deploying and running highly capable agents at scale via Bedrock AgentCore. AgentCore has shown remarkable momentum—its SDK has been downloaded over 2 million times in just a few months since launch (a minimal example of building on it follows below). We also released our first quantum computing chip prototype, Ocelot—a breakthrough that can reduce the cost of implementing quantum error correction by up to 90%.
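For a sense of what building on that SDK looks like, here is a minimal sketch based on the publicly documented bedrock-agentcore Python package. Treat the module, class, and decorator names as assumptions—they may differ between SDK versions—and a real agent would call a model and tools rather than echoing input.

```python
# Minimal sketch of an agent hosted with the Bedrock AgentCore SDK.
# Module/class/decorator names follow the documented bedrock-agentcore
# package but should be treated as version-dependent assumptions.
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload: dict) -> dict:
    # AgentCore delivers each request as a JSON payload; a real agent
    # would invoke a foundation model and tools here instead of echoing.
    user_message = payload.get("prompt", "")
    return {"result": f"Agent received: {user_message}"}

if __name__ == "__main__":
    # Serves the entrypoint locally; AgentCore Runtime hosts the same
    # code at scale once deployed.
    app.run()
```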
All this starts with a secure, available, and resilient global infrastructure—an area where we remain unrivaled. AWS has the largest and most comprehensive AI cloud infrastructure to date: our global data center network spans 38 regions and 120 availability zones, and we've announced plans to add three more regions. In the past year alone, we added 3.8 gigawatts (GW) of data center capacity—more than anyone else in the world. We also operate the world's largest private network, which grew 50% in the past 12 months and now spans over 9 million kilometers of terrestrial and submarine cable—enough for more than 11 round trips between Earth and the Moon.
At Amazon, everything starts with the customer. Today, millions of customers run various use cases on our platform: large enterprises across industries, financial services, healthcare, media and entertainment, communications, and even government agencies—all operating and transforming their businesses on AWS.
For AWS, security is the top priority and the foundation of everything. This is why the U.S. intelligence community has chosen AWS as its preferred cloud service provider for over a decade; Nasdaq migrated its trading market to AWS; Pfizer chose AWS as the core of its digital transformation.
We also know the importance of partners to customer success. We thank our extensive partner network—including many SaaS providers, system integrators, and solution providers present this week—without you, we couldn't serve such a wide range of global customers.
I have a personal soft spot for startups: far more "unicorns" have been built on AWS than on any other platform. Today, more startups—especially AI startups—are flocking to AWS: 85% of the companies on the Forbes 2025 AI 50 list and 85% of the CNBC Disruptor 50 run on AWS. What these founders have achieved is amazing.
(The following is shared by the AudioShake team)
AudioShake won last year's re:Invent "Unicorn Tank" pitch competition. Imagine being able to pull the music, the car sounds, or a background conversation out of a recording of a rainforest, a playground, or a street performance—what would that make possible?
At AudioShake, we separate sounds so that humans and machines can access and understand them in new ways. Our multi-speaker separator is the world's first technology that can separate different speakers' voices into high-resolution streams. This can be used in call centers to isolate individual voices, and is widely applied in media and entertainment.
More importantly, we see huge potential in the field of hearing and language disorders. We collaborate with non-profit organizations focused on ALS (amyotrophic lateral sclerosis) to use patients' old recordings before onset to separate and clone their original voices—allowing patients to communicate with their own voices.
We started with only three people. Without AWS, we couldn't have obtained the infrastructure needed to deliver models to real customers. We run our entire production pipeline on AWS—from inference and storage to job orchestration and the entire production environment. We are entering a world where sound is more customizable: this not only helps people with hearing impairments listen to the world in the way they want but also helps machines understand the real world more deeply.
Thank you to the AudioShake team for that wonderful presentation. Everything AWS does depends on builders—especially developers—and this conference has always been, at its heart, a learning event built around them. I want to thank the millions of AWS developers around the world, especially the AWS Heroes here today, and our user-group community of more than 1 million members across 129 countries.
Why do we do this? What motivates us? Why do we still maintain the same enthusiasm as when AWS was founded 20 years later?
The driving force behind us every day is to give you complete freedom to invent. From the moment AWS was founded, this has been our vision: to enable every developer or inventor—whether in a dormitory or garage—to get the technical infrastructure and capabilities needed to build anything they imagine.
20 years ago, this was impossible: developers couldn't get servers or computing power without investing enormous money and time. You spent your time buying servers and managing infrastructure instead of building products. We experienced this firsthand inside Amazon—we had builders with great ideas, but they were hamstrung by infrastructure and couldn't move quickly.
So we asked ourselves: "Why not?" Why can't developers focus on building? Why can't we reduce the time and cost of experiments to zero? Why can't every idea be possible?
Over the past 20 years, we've been innovating to achieve these goals. Today, we are witnessing an explosive wave of invention in AI: every company and every industry is being reshaped. Although the technology iteration speed is unprecedented, we are still in the early stages of the AI revolution.
But I know many customers haven't yet seen returns that match AI's promise—its true value hasn't been fully unlocked. That is changing fast: we're seeing "AI assistants" give way to "AI agents." Agents don't just answer questions; they perform tasks and automate processes. This is where AI investments start to generate substantial business returns.
I believe the emergence of Agents will bring us to an inflection point in AI development: AI is transforming from a technical wonder to a productivity tool that delivers real value. This change will have an impact on business comparable to the birth of the Internet or cloud computing.
In the future, billions of agents will emerge inside every company and across every field. We've already seen agents accelerate drug discovery, improve customer service, and speed up payroll processing. In some cases, agents amplify a person's impact tenfold, freeing up more time to innovate.
Wouldn't it be great if everyone could have that kind of impact? We think so—which is why we once again asked ourselves: "Why not?"
To move towards a future with billions of Agents and allow every organization to get real business results from AI, we must break the feasibility limits of infrastructure. We need to invent new building blocks for Agent-capable systems and applications; we need to reimagine every process and the way you work.
At AWS, we've been innovating at every layer of the stack to give you complete freedom to invent the future. Delivering agents that create real value starts with the most scalable and powerful AI infrastructure: a secure cloud platform that provides the absolute best performance for AI workloads—at the lowest possible cost across model training, customization, and inference.
Easier said than done—this requires deep optimization of every layer of hardware and software. There are no shortcuts, and this is exactly what AWS can do.
2. AI Infrastructure
When we think about AI infrastructure, GPUs come to mind first. Running NVIDIA GPUs on AWS is without question the best choice. We were among the first providers to offer NVIDIA GPUs in the cloud, and our partnership with NVIDIA goes back more than 15 years—which means we have mastered running GPUs at scale. Ask anyone who has run large GPU clusters elsewhere and they'll tell you AWS clusters are the most stable: we excel at avoiding node failures and deliver the best reliability.
That comes from our attention to detail: even small tasks like tuning the BIOS to prevent GPU restarts get real engineering effort. Elsewhere, people may accept the status quo and say, "that's just how it works." We're different: we chase every problem down to its root cause, then work with our partner NVIDIA to keep improving. For us, no problem is too small to matter—these details are why we lead the industry in GPU reliability. It takes hard work and real engineering, and we keep raising the bar on these dimensions with every product generation.
This year, we launched the sixth-generation P6 EC2 instances—including P6e UltraServers built on NVIDIA's Blackwell GB200, delivering over 20x the compute of our previous P5en instances. These instances are ideal for customers running ultra-large AI models.
Today, we are pleased to announce the new P6e-GB300, powered by NVIDIA's latest GB300 NVL72, continuing to provide best-in-class compute for the most demanding AI workloads. Our full-stack approach to hardware and software, combined with rigorous operations, ensures that the world's largest organizations get the absolute best performance and reliability.
The world's largest organizations—including NVIDIA itself (via Project Ceiba on AWS) and companies like OpenAI—are actively running large-scale general-purpose generative AI clusters on AWS today. They are using EC