
Jensen Huang has today laid out his full view of the next decade.

字母AI 2026-06-01 19:17
NVIDIA unveils four "magic weapons".

"Computing is revenue, watts are revenue, and every token is revenue!"

These remarks all come from Jensen Huang's speech at GTC 2026. They amount to the AI era's version of "time is money."

Jensen Huang said that the more tokens generated per watt, the more revenue there will be.
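The arithmetic behind that claim can be sketched in a few lines. This is only an illustration: it assumes a facility with a fixed power budget, and the function name, the tokens-per-joule figure, and the price per million tokens are all hypothetical values, not numbers from the speech.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_joule,
                         usd_per_million_tokens, utilization=1.0):
    """Yearly revenue of a power-limited AI factory (illustrative model only).

    "Tokens per watt" is read here as tokens per second per watt,
    which is the same thing as tokens per joule.
    """
    joules = power_watts * SECONDS_PER_YEAR * utilization
    tokens = joules * tokens_per_joule
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical 1 MW facility: same power budget, doubled efficiency.
base = annual_token_revenue(1e6, tokens_per_joule=1.0, usd_per_million_tokens=1.0)
better = annual_token_revenue(1e6, tokens_per_joule=2.0, usd_per_million_tokens=1.0)
print(f"baseline: ${base:,.0f}  doubled efficiency: ${better:,.0f}")
```

With power held constant, revenue in this model scales linearly with tokens per joule, which is the point of the quote.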

He presented a set of data. The number of code commits on GitHub increased nearly threefold in the first few months of 2026. The $3 trillion in compensation value created by 30 million software developers worldwide is generating nearly $9 trillion in productivity.

At this GTC conference, Jensen Huang announced a long list of new products.

The most significant one is undoubtedly the AI PC jointly designed by NVIDIA and Microsoft. The second is Vera and its complete ecosystem built for the Agent era. The third is the open-source large model Nemotron 3 Ultra. The fourth is the physical AI Cosmos 3 and the reference humanoid robot Isaac based on it.

Together, they make up Jensen Huang's complete view of the computing model for the next decade.

01 Redefining the AI PC

Jensen Huang said that the cooperation between Microsoft and NVIDIA will redefine the concept of the AI PC.

Jensen Huang demonstrated the RTX Spark on stage.

The RTX Spark is a laptop. Its chip, the N1X, was developed jointly by NVIDIA and MediaTek. It has a Blackwell RTX GPU with 6,144 CUDA cores and fifth-generation Tensor Cores that support FP4 precision. It also has a customized 20-core Grace CPU, connected through the NVLink-C2C chip interconnect. The chip comes with 128GB of unified memory, is built on TSMC's 3nm process, and has 70 billion transistors.

Applications such as digital biology, seismic processing, and astrophysics can all run on it. All CUDA-based applications in physics, biology, genomics, AI, and computer graphics can run, as can Windows applications.

The biggest difference between this computer and traditional laptops is that it can run Agents locally. The Agent that Jensen Huang mentioned is an AI assistant that can understand what you say, view the screen, read files, and help you with tasks. Previously, these AIs had to be connected to the cloud to be used, but now they can run directly on your laptop.

Jensen Huang said that in the past 40 years, you used a computer by launching applications, clicking, and typing. Now, with the RTX Spark and Windows, you only need to ask, and the computer will help you complete the work. The RTX Spark integrates all the technologies that NVIDIA has accumulated over 30 years, including CUDA, RTX, and the AI platform, into a single chip. Local Agents, cutting-edge models, creative workflows, and RTX games can all run on a single laptop.

This is the personal AI computer in Jensen Huang's hands.

Microsoft has made in-depth platform optimizations for the RTX Spark.

It has implemented workload profile scheduling, allowing the Windows scheduler to more efficiently scale the workload across all 20 cores. Whether you are checking emails or running an Agent locally to debug code, the Windows scheduler will ensure that you get the best performance and efficiency from the CPU.

They have also enabled the Microsoft power and thermal management framework to maximize performance and power while keeping the device cool.

To make full use of up to 128GB of memory on the RTX Spark, Microsoft has raised the upper limit of system memory accessible to the GPU. This increases the memory available to the GPU on high-memory systems, enabling it to load larger local AI models or render more complex projects.

They have also enhanced the way Windows manages the page size of the shared memory area on the unified memory system, ensuring that larger memory pages are available under heavy workloads, while allowing developers to flexibly optimize the memory workload requirements between the CPU and the GPU.

Microsoft CEO Satya Nadella said that their goal is to bring infinite intelligence to every home and every desk with Windows.

Open-source Agent projects such as OpenClaw and Hermes Agent have set records on GitHub and OpenRouter, but they have not been widely adopted because it is not possible to run Agents safely and privately on users' main computers.

NVIDIA and Microsoft have cooperated to solve this problem. They have developed new Windows security primitives and the NVIDIA OpenShell runtime to ensure that Agents run safely under the full control of users.

The new Windows provides identity, isolation, policy, and end-to-end security capabilities for natively building and running Agents.

The NVIDIA OpenShell provides some customized functions, such as allowing users to limit what an Agent can and cannot do, intelligently routing queries to local models according to the user's privacy policy, and hiding personal information in queries sent to cloud models.
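The routing and redaction behavior described above can be illustrated with a small sketch. This is not the OpenShell API, which the article does not show. The `PrivacyPolicy` class, the `route` and `redact` functions, the keyword list, and the regular expressions are all hypothetical stand-ins for whatever policy engine the real runtime uses.

```python
import re
from dataclasses import dataclass

# Deliberately simple patterns for illustration; real PII detection is harder.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


@dataclass
class PrivacyPolicy:
    """Hypothetical user policy: which topics must stay on the device."""
    local_only_keywords: tuple = ("medical", "salary", "password")
    redact_for_cloud: bool = True


def redact(text):
    """Mask email addresses and phone numbers before text leaves the device."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def route(query, policy):
    """Return (destination, text_to_send) for a query under the given policy."""
    lowered = query.lower()
    if any(keyword in lowered for keyword in policy.local_only_keywords):
        return ("local", query)  # sensitive topic: never leaves the laptop
    outgoing = redact(query) if policy.redact_for_cloud else query
    return ("cloud", outgoing)


policy = PrivacyPolicy()
print(route("Summarize my salary history", policy))
print(route("Email alice@example.com about lunch", policy))
```

The sketch keeps sensitive queries on the local model unmodified and strips personal identifiers only from what is sent to the cloud, mirroring the two behaviors the article attributes to OpenShell.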

Hermes Agent and OpenClaw use this security and privacy layer in their new Windows applications. These applications allow users to easily and safely access on-device Agents, which can perform tasks in Windows applications, reason about cross-application workflows, generate images and videos, write plugins and application code, and perform semantic searches on local files.

Jensen Huang demonstrated live how an Agent running locally on the RTX Spark helped him design a house. The Agent runs in the OpenShell sandbox and connects to the Hermes orchestration system and the cloud-based Claude Sonnet.

It selects a location, reads concept sketches, style mood boards, text requirements, and design intentions. The Agent uses the tools on the laptop to open Rhino to model the site, shape the terrain, setbacks, and building shell, propose building forms, and optimize for cost, comfort, and quality.

After the form is determined, the Agent generates the internal layout, walls, and circulation, and the rooms take shape. It adjusts at any time, automatically places doors, windows, and structural elements, and discovers and corrects errors on its own. After approval, the Agent exports the model from Rhino and imports it into Blender, with the materials and object properties transferred intact.

It adjusts the materials, selects the camera, and Blender renders the house. The Agent uses the Flux model to generate multiple perspectives and lighting conditions.

The entire process is completed by the Agent itself.

This is what Jensen Huang calls the "new PC." In the past, you used a computer by opening software, clicking the mouse, and typing on the keyboard. Now you can directly tell the Agent what you want to do, and it will operate various software to complete the task on its own.

The RTX Spark is not only designed for Agents; it is also a complete creative and gaming computer.

You can render an ultra-large 90GB 3D scene through OptiX and DLSS, edit 12K 4:2:2 video with the Blackwell decoder, run a large language model with 120 billion parameters and a 1-million-token context, and play AAA games at 1440p resolution and over 100 frames per second, with support for ray tracing, DLSS, and Reflex.

The RTX Spark will also support new RTX capabilities, including DLSS 4.5 ray reconstruction, which uses a second-generation transformer model and will appear in Blender 5.3 and dozens of games. There is also RTX Video 4x frame generation, which will appear in ComfyUI.

The RTX Spark is a laptop. However, Jensen Huang also announced the launch of desktop and workstation versions, the DGX Spark.

It has 768GB of memory, can run large models with trillions of parameters, delivers 20 petaflops of computing power and 8TB per second of memory bandwidth, and can be placed on a desk. If you are a large language model developer or an Agent developer, you can train and test models locally and then deploy them to the cloud when needed.
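A rough back-of-the-envelope check shows why FP4 precision matters for these memory figures. The sketch below counts model weights only and ignores the KV cache, activations, and the operating system. The 0.8 headroom factor and the function names are assumptions made for illustration, not figures from the speech.

```python
def weight_footprint_gb(num_params, bits_per_param):
    """Memory needed for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def fits(num_params, bits_per_param, memory_gb, headroom=0.8):
    """True if the weights fit within an assumed usable fraction of memory."""
    return weight_footprint_gb(num_params, bits_per_param) <= memory_gb * headroom


# 120-billion-parameter model on the 128GB laptop
print(weight_footprint_gb(120e9, 4), fits(120e9, 4, 128))    # FP4
print(weight_footprint_gb(120e9, 16), fits(120e9, 16, 128))  # FP16

# 1-trillion-parameter model on the 768GB desktop system
print(weight_footprint_gb(1e12, 4), fits(1e12, 4, 768))      # FP4
print(weight_footprint_gb(1e12, 8), fits(1e12, 8, 768))      # FP8
```

Under these assumptions, a 120-billion-parameter model needs about 60GB at 4 bits per weight but 240GB at 16 bits, and a trillion-parameter model needs about 500GB at 4 bits, which is consistent with the memory sizes quoted for the two machines.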

Jensen Huang asked the audience to think about phones 15 to 20 years ago. On today's mobile phones, making calls is no longer the most commonly used function. The meaning of the mobile phone has completely changed, and PCs will undergo a similar change. In ten years, PCs will not just be tools for opening software and clicking a mouse.

ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI will launch ultra-thin Windows laptops and compact desktop PCs powered by the RTX Spark this fall, with all-day battery life and high-quality displays. Models from Acer and GIGABYTE will be launched later. Jensen Huang did not mention the specific prices.

02 Vera Rubin and the AI Factory

Jensen Huang then announced that Vera Rubin is in full production.

Vera Rubin is an AI supercomputer system spanning five racks, specifically designed to run Agents.

The first is the Vera Rubin NVL72, which is responsible for prompt understanding, context processing, reasoning, and planning. It is the "brain" of the Agent.

The second is the Vera CPU rack. A single liquid-cooled rack contains 256 Vera CPUs, which are responsible for coordinating models, managing memory, and invoking tools.

The third is the Groq 3 LPX rack. 256 Groq 3 LPUs are spread across 16 brackets, with an SRAM bandwidth of 40PB per second, providing ultra-low-latency token generation. The NVL72 is responsible for high throughput, and the Groq LPU is responsible for low latency.

The fourth is the Vera BlueField-4 STX storage rack, which is where the Agent stores its memory and is responsible for storage processing, acceleration, and on-chip security.

The fifth is the NVIDIA Spectrum-X Ethernet CPO network rack. It is equipped with an Ethernet switch that uses co-packaged optics technology and 200Gb/s SerDes, built with chip-level packaging developed in cooperation with TSMC and ultra-high-power indium phosphide laser modules.

Vera Rubin consists of seven new chips. It uses TSMC's 3nm process and CoWoS-L packaging technology. The HBM memory comes from Micron, SK hynix, and Samsung. A Vera Rubin computing board has trillions of transistors and more than 18,000 components.

The entire rack contains 18 computing trays, 9 hot-swappable NVLink switch trays, an efficient liquid-cooling manifold, and a bus. The liquid-cooling bus can carry a current of more than 5,000 amperes, equivalent to the current of 20 electric vehicles accelerating at full speed. A total of 1.3 million components form the third-generation MGX rack design.

Compared with the previous-generation Grace Blackwell, Vera Rubin has a tenfold increase in throughput when processing Agent tasks.

Jensen Huang said that the supply-chain scale they created for Vera Rubin is twice that of Grace Blackwell.

Previously, it took two hours to assemble a Grace Blackwell rack, but now it only takes five minutes for Vera Rubin. The reason is the design change. In the past, there were many cables and hoses in the rack, but now a PCB mid-board is used to directly connect both sides, eliminating the need for cables, hoses, and fans. It is all liquid-cooled, with a modular design and hot-swappable components.

Jensen Huang said that when developing Hopper, the most important work was pre-training. With Grace Blackwell, the focus was on inference.

"Many people say that inference is easy, but inference is money."

As models become more and more complex, it is difficult to deliver fast responses, rapid interaction, and high throughput in inference all at the same time. This is the significance of NVLink 72.

Jensen Huang said that today, NVIDIA's token cost is an order of magnitude lower than that of its competitors because they have done co-design and understood the computing model of inference.

Now, in the Agent era, an Agent not only generates answers but also needs to observe, reason, plan, use tools, manage a large amount of context, process working memory and long-term memory, and spawn expert sub-Agents. Vera Rubin was born for this kind of work.

The Vera Rubin platform introduces NVIDIA Spectrum-X Ethernet photonics, which is the world's first switch based on co-packaged optics technology, with 200Gb/s SerDes, and is now in production.

What is co-packaged optics?

Traditional network switches use pluggable transceivers, which are plugged in outside the switch and require additional power, heat dissipation, and space. Co-packaged optics packages the optical module directly with the switch chip, using chip-level packaging developed in cooperation with TSMC.

This brings three benefits. First, energy efficiency is five times higher because the distance between the optical module and the chip is shortened, resulting in less signal loss. Second, uptime for AI workloads is five times longer because there are fewer pluggable components that can fail. Third, the deployment time is shortened by one-third because the design is simplified, releasing more power for computing.

CoreWeave, Lambda, and Oracle Cloud Infrastructure are the first partners to adopt co-packaged optical networks. Lambda showed the unboxing of NVIDIA's first co-packaged optical samples in a blog. Jensen Huang said that by simplifying the design to release more power for computing, NVIDIA's co-packaged optical network provides the infrastructure for a million-GPU AI factory.

The Vera Rubin platform also integrates the NVIDIA BlueField-4 DPU.

The BlueField-4 has a software-defined network with a speed of up to 800Gb/s and built-in multi-tenant isolation. With the NVIDIA BlueField-4 Advanced Secure Trusted Resource Architecture, customers can simplify network operations, improve tenant isolation, and gain greater control in a million-GPU AI cluster.

AI factories are increasingly processing proprietary data, regulated content, and mission-critical models in Agent workflows. This requires infrastructure security customized for autonomous Agents in shared or cloud environments because the infrastructure cannot be implicitly trusted.

The Vera Rubin platform is designed with full-stack NVIDIA confidential computing to provide a rack-scale trusted execution environment. The Vera Rubin NVL72 combines the Vera CPU, Rubin GPU, NVIDIA NVLink network, and security features into a unified platform, encrypting data across the high-speed interconnects. This provides hardware-level authentication to ensure the system is tamper-proof.

Providing this level of protection at the POD scale also requires a programmable software layer that can execute, orchestrate, and adjust security policies across the entire system. The NVIDIA DOCA software platform provides security at each Vera Rubin platform rack and AI factory layer, protecting data, Agents, context memory, and AI inference through capabilities executed directly in the BlueField-4 silicon.

What can DOCA do? It implements multi-tenant network isolation, zero-trust policy execution, runtime threat detection, and end-to-end encryption at a speed of up to 80