Jensen Huang's 10,000-word speech at GTC 2026: In the era of AI factories, 80% of applications will disappear. Why is OpenClaw the next Linux?
"Welcome to GTC!"
When Jensen Huang stepped onto the stage in his iconic leather jacket, the entire venue erupted. But this time, he did not just launch a new chip: he painted a vision of an entirely new world, a future built on AI factories, a token economy, and intelligent agents. In this future, most traditional applications will disappear, data centers will transform into token production factories, and the open-source project OpenClaw is emerging as the operating system of this new world.
Let's recap this nearly three-hour keynote and break down the technology blueprint Jensen Huang laid out for 2026 and beyond.
1. 20 Years of CUDA: Flywheel Effect Accelerates Growth
At the start of the keynote, Huang looked back at NVIDIA's foundation—CUDA. This year marks the 20th anniversary of CUDA. What began as an architecture few initially believed in now boasts hundreds of millions of installations. From programmable shaders to RTX, to the AI explosion, CUDA's flywheel effect continues to accelerate: a massive user base attracts developers, developers create breakthrough algorithms, algorithms spawn new markets, and new markets expand the user base further.
"Downloads of NVIDIA libraries are growing at an astonishing rate, larger than ever before," Huang emphasized. It is this flywheel effect that gives NVIDIA GPUs an extremely long lifespan and broad applicability, covering the entire AI lifecycle from data processing to scientific computing, from training to inference.
2. The Inference Turning Point: AI Begins to Think
"Computing demand has increased one million times in the past two years," Huang said, sharing a staggering figure. The reason lies in the leap in AI capability: from ChatGPT opening the era of generative AI, to the o1 model gaining reasoning ability, to Claude Code becoming the first agent model capable of working autonomously. Every advance means exponential growth in compute during the inference stage.
"AI needs to think now," Huang pointed out: thinking requires inference, and inference requires generating a large number of tokens. Compared to training, the compute demand for inference has increased by around 100,000 times. This is the inference turning point: AI has moved from "perception" to "generation", and from "reasoning" to "action".
This turning point has brought staggering market demand: in 2026, NVIDIA's Blackwell and Rubin product lines have already secured $500 billion in orders, and by 2027, this figure will reach at least $1 trillion.
3. New Hardware Launch: Vera Rubin and Groq Integration
On the hardware front, Huang launched the new-generation AI supercomputing platform Vera Rubin. The platform includes the Vera CPU, Rubin GPU, NVLink-72 interconnect, and all-new storage and networking systems. Compared to Hopper, Vera Rubin delivers a 35x improvement in token throughput at the same power consumption.
What is even more notable is that NVIDIA announced a deep partnership with the Groq team, integrating Groq's LPU (Language Processing Unit) into the Vera Rubin system. Groq chips use a deterministic data flow architecture and a massive SRAM design, optimized specifically for ultra-low latency inference. The combination delivers another 35x performance improvement for inference at the highest value tier.
"We are building a Kyber rack housing 144 GPUs, connected via copper cables, delivering unprecedented scaling density," Huang said as he demonstrated the Rubin Ultra compute node on stage. It is so large that stage machinery was needed to lift it into place.
4. AI Factories: From Data Centers to Token Factories
Huang put forward a core concept: future data centers will no longer be places to store and process data. They will be "AI factories", and their product is tokens. Every AI factory is constrained by power: a 1-gigawatt factory can never become a 2-gigawatt factory, so the number of tokens produced per watt becomes the key metric.
"This is your future revenue curve," he said as he presented a 2D chart with token throughput on the vertical axis and inference speed (interactivity) on the horizontal axis. Different tiers of service correspond to different pricing: a free tier, a mid-tier service, and a premium research service. Through hardware-software co-design, NVIDIA can shift the entire curve upward, allowing customers to generate more than 5x the revenue with the same amount of power.
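The economics Huang describes can be sketched as a toy model. Assuming a fixed power budget, a tokens-per-watt efficiency figure, and three hypothetical service tiers with per-token prices (all numbers below are illustrative, not from the talk), revenue per second is simply the token throughput each tier is allocated times its price, and an efficiency gain at fixed power scales the whole curve:

```python
# Toy model of an "AI factory" revenue curve: a fixed power budget,
# a tokens-per-watt efficiency figure, and hypothetical per-token
# prices for each service tier. All numbers are illustrative.

POWER_WATTS = 1e9        # a 1-gigawatt factory
TOKENS_PER_WATT = 0.5    # hypothetical efficiency: tokens/sec per watt

# Hypothetical tiers: (name, price in $ per million tokens,
# fraction of total capacity allocated to that tier)
TIERS = [
    ("free",     0.0, 0.50),
    ("mid",      2.0, 0.35),
    ("premium", 20.0, 0.15),
]

def revenue_per_second(tokens_per_watt: float) -> float:
    """Revenue/sec = sum over tiers of (allocated tokens/sec * price)."""
    total_tps = POWER_WATTS * tokens_per_watt
    return sum(total_tps * share * price / 1e6 for _, price, share in TIERS)

base = revenue_per_second(TOKENS_PER_WATT)
improved = revenue_per_second(TOKENS_PER_WATT * 5)  # 5x efficiency shift
print(f"${base:,.0f}/s -> ${improved:,.0f}/s")  # prints: $1,850/s -> $9,250/s
```

Because power is fixed, revenue is linear in tokens per watt: shifting the curve up 5x multiplies revenue 5x, which is the sense in which tokens per watt is the factory's key metric.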
To this end, NVIDIA launched Dynamo, an operating system designed specifically for AI factories, and the DSX platform, a digital twin blueprint for designing and operating AI factories that integrates a full toolchain from mechanical simulation to power grid optimization.
5. OpenClaw: Open-Source Operating System for Agent Systems
During the keynote, Huang dedicated a large portion of his talk to an open-source project: OpenClaw. This personal AI agent, developed by Peter Steinberger, became the most popular open-source project in human history in just a few weeks, surpassing 30 years of growth for Linux.
"What is OpenClaw? It is an agent system that can call large models, access tools and file systems, break down tasks, spawn sub-agents, and interact with you in a variety of ways," Huang explained. He believes OpenClaw is essentially an "operating system for intelligent computers": just as Windows ushered in the PC era, OpenClaw will usher in the era of personal agents.
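The agent loop Huang describes (call a model, use tools, feed results back, stop when done) can be illustrated with a minimal sketch. OpenClaw's actual architecture is not detailed in the talk; the `Agent` class, tool registry, and hard-coded stub standing in for a real model call below are all hypothetical:

```python
# Minimal agent-loop sketch: a "model" proposes either a tool call or a
# final answer; the loop executes tools and feeds observations back.
# The model here is a hard-coded stub standing in for a real LLM call.

from dataclasses import dataclass, field

@dataclass
class Agent:
    tools: dict = field(default_factory=dict)   # name -> callable
    history: list = field(default_factory=list)

    def model(self, task: str) -> dict:
        # Stub policy: read a file once, then answer. A real agent would
        # call a large model here and parse its structured reply.
        if not any(step["action"] == "read_file" for step in self.history):
            return {"action": "read_file", "arg": "notes.txt"}
        return {"action": "answer", "arg": f"done: {task}"}

    def run(self, task: str, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            step = self.model(task)
            self.history.append(step)
            if step["action"] == "answer":
                return step["arg"]
            # Execute the requested tool and record the observation.
            result = self.tools[step["action"]](step["arg"])
            self.history.append({"action": "observation", "arg": result})
        return "gave up"

agent = Agent(tools={"read_file": lambda path: f"<contents of {path}>"})
print(agent.run("summarize my notes"))  # prints: done: summarize my notes
```

Sub-agents fit the same shape: a tool whose implementation is another `Agent.run` call, which is how a task can be broken down and delegated.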
Every company now needs to develop an "OpenClaw strategy". To support this, NVIDIA launched the NemoClaw reference design, which integrates enterprise-grade security, privacy protection, and policy engines, allowing enterprises to deploy agent systems securely. At the same time, NVIDIA released multiple cutting-edge open models, including Nemotron, Kosmos, ALPAMIO, and GROOT, covering language, vision, physical AI, autonomous driving, and other fields.
"Every SaaS company will become an Agent-as-a-Service company," Huang predicted, adding that in the future every engineer will have an annual token budget, using AI to amplify their capabilities.
6. Physical AI: Robotics and Autonomous Driving
The final section of the keynote focused on physical AI—robotics. Huang announced four new partners for NVIDIA's autonomous robotaxi platform: BYD, Hyundai, Nissan, and Geely. Combined with previous partners Mercedes-Benz, Toyota, and General Motors, the platform covers a total of 18 million vehicles produced annually.
In the robotics field, NVIDIA partnered with Disney and DeepMind, training a character robot that can walk and interact based on the Newton solver and Kosmos world model. On stage, an Olaf the snowman robot walked out to interact with Huang, demonstrating the latest progress in physical AI.
"The world's first large-scale deployment of physical AI is here," Huang summarized. From autonomous driving to industrial robots, from operating room assistance to entertainment characters, physical AI is moving from simulation to reality.
The three-hour keynote was extremely dense with information. The core message Huang delivered is clear: we are at a fundamental turning point in computing paradigm—moving from retrieval-based computing to generative computing, from data storage to token production, from application software to intelligent agents.
In this new world, hardware is the foundation of AI factories, software is the soul of agent systems, and open-source ecosystems like OpenClaw are the glue that connects everything. As Huang put it: "The future is here. Why don't you come see it for yourself?"
For developers, entrepreneurs, and everyone following technological change, the signal from GTC 2026 couldn't be clearer: The era of AI factories is here, tokens will become the new currency, and your "OpenClaw Strategy" will determine your position in the next decade.
Full Transcript of Jensen Huang's NVIDIA GTC 2026 Keynote:
Welcome to GTC! This is a technology conference, and we're here to talk about technology, and talk about platforms.
NVIDIA has three platforms. You probably think we mainly talk about one of them, which is related to CUDA X. Our systems are another platform, and now we have a new platform called AI Factory. We will talk about all of these. Most importantly, we will talk about the ecosystem.
Thank you to the hosts of the pre-show, they did an amazing job. Sarah Guo, Alfred Lin, Gavin Baker—these three know technology incredibly well, and have a perfect grasp of everything that's happening. Of course, they also cover an extremely wide range of the technology ecosystem. And I want to thank all the VIPs I personally selected for today—this all-star team, thank you all for being here.
Thank you to all the companies that are here today. As you know, NVIDIA is a platform company. We have the technology, we have the platforms, we have a rich ecosystem. Today, this entire $100 trillion industry is all gathered right here. This event has 450 sponsoring companies, 1,000 technical sessions, and 2,000 speakers. This conference covers every layer of the five-layer AI stack. From infrastructure including land, power, and facilities, from chips to platforms, to models, and of course, ultimately all the applications are what make this industry take off. It all starts right here.
This year marks the 20th anniversary of CUDA. We have been developing CUDA for 20 years. This revolutionary invention, Single Instruction Multiple Threads (SIMT), lets you write scalar code that spawns many threads, and it is much easier to program than SIMD. We recently added Tiles to help people program tensor cores and the mathematical structures that are critical for modern AI.
We have thousands of tools, compilers, frameworks, libraries, and open-source software, supporting hundreds of thousands of public projects. CUDA is now integrated into practically every ecosystem.
This chart basically describes NVIDIA's entire strategy. You've heard me talk about this slide from the very beginning. At the end of the day, the hardest thing to build is that bottom layer: the installed user base. It took us 20 years to build hundreds of millions of CUDA-running GPUs and computing systems globally. We are present across all cloud platforms, across all computer companies, serving almost every industry.
CUDA's installed base is what drives its accelerating flywheel effect. A large user base attracts developers, developers in turn create new breakthrough algorithms like deep learning, and many other examples. These breakthroughs spawn entirely new markets, and new ecosystems are built around these markets, with other companies joining in, which expands the user base further. This flywheel is accelerating right now.
Downloads of NVIDIA libraries are growing at an astonishing rate. It's larger than ever, and growing faster than ever before. It is this flywheel effect that allows this computing platform to support so many applications, deliver so many new breakthroughs, and most importantly, it gives this infrastructure an extremely long lifespan.
NVIDIA CUDA supports a huge range of applications. We support every stage of the AI lifecycle. We are optimized for all data processing platforms. We deliver acceleration to all types of science-driven problem solvers. As a result, once you install an NVIDIA GPU, it has an extremely wide range of use cases and is applicable in a very high share of scenarios. This is one of the reasons why Ampere, which we shipped around six years ago, still commands rising prices in the cloud. All of this is possible fundamentally because of its large user base, strong flywheel effect, and wide developer coverage.
As all this happens, and we continue to update our software, computing costs go down. The combination of accelerated computing technology drastically improves application speed. At the same time, we continue to maintain and update software throughout its entire lifecycle. You don't just get an initial performance boost—over time, accelerated computing continues to bring down costs.
We are committed to nurturing and supporting every NVIDIA GPU ever made, because they are all architecturally compatible. We do this because if we release a new optimization, it reaches a huge installed user base, benefiting millions of people all over the world. It is this dynamic combination that allows the NVIDIA architecture to continuously expand its application scope, grow faster, lower computing costs, and ultimately drive new growth.
So CUDA is at the core of all this. But our journey actually started 25 years ago.
I know how many of you grew up playing games on GeForce. GeForce is NVIDIA's most successful marketing campaign. We started attracting future customers long before you could pay for it yourself—your parents paid. Your parents paid to make you an NVIDIA customer, they paid on time every year, year after year, until one day you became an excellent computer scientist, and a paying customer yourself, and a developer.
This is the foundation that GeForce built. We started this journey 25 years ago, and it led to the birth of CUDA. 25 years ago, we invented programmable shaders, a completely unexpected invention that made accelerators programmable and created the world's first programmable accelerator: the pixel shader. That pushed us to explore deeper and deeper, and five years later, 20 years ago, CUDA was invented. It was one of our biggest investments, one we could barely afford at the time. It consumed the vast majority of our company's profits, all to bring CUDA technology to every PC via GeForce graphics cards. We poured our hearts into building this platform because we believed so strongly in its potential. In the end, it was the company's commitment, even when it was hard at the start, driven every day by belief.
Thirty years and thirteen generations of products later, CUDA is installed everywhere. Of course, pixel shaders led to the GeForce revolution, and eight years ago we launched RTX, a complete redesign of our architecture for the modern era of computer graphics.
GeForce brought CUDA to the entire world. As a result, GeForce powered Alex Krizhevsky, Ilya Sutskever, Geoff Hinton, Andrew Ng, and many others, who discovered that GPUs could be a great tool to accelerate deep learning. It ignited the AI explosion.
Ten years ago, we decided to combine programmable shading technology with two new ideas: hardware ray tracing, which is extremely difficult to implement, and the idea that AI would completely transform how computer graphics are made. Just as GeForce brought AI to the world, AI is now completely transforming how computer graphics are created.
Today, I want to show you something from the future. This is our next-generation graphics technology. We call it neural rendering: the fusion of 3D graphics and AI. This is DLSS 5.0.
(Video Demonstration)
Computer graphics are incredibly lifelike. So what did we do? We merged controllable 3D graphics with the realistic look of the virtual world. The virtual world is structured data—remember that term. We combined 3D graphics, structured data, with generative AI. One is fully predictable, the other is probabilistic but incredibly realistic. We combined these two ideas: control via structured data, perfect controllability, while also generating data. As a result, the content is