Competing for CPUs and vying for the PC market, Jensen Huang made a splash in Taipei. Intel and AMD should be worried.
Just now, at the NVIDIA GTC Taipei 2026 conference, Jensen Huang made another appearance in his iconic leather jacket.
The first sentence of his opening speech set the tone: "Two years ago when I came here, I started talking to you about the next wave of AI. Today I can tell you that Agentic AI has arrived, and useful AI has arrived."
At the NVIDIA GTC Taipei 2026 conference, Jensen Huang mentioned six key points:
First, Token economics. Tokens are now the unit of profit. A cheap chip doesn't mean you're making money, and an expensive chip doesn't mean you're losing money.
Second, the five core components of the Agent architecture: Model, Harness, Tools, Skills, and Runtime.
Third, Vera Rubin is now in full production, and shipments will start in the fall.
Fourth, Vera, a CPU for the agent era, is released. Compared with x86 CPUs, it completes tasks 1.8 times faster.
Fifth, the personal computer superchip RTX Spark is released. Jensen Huang said, "All the essence of what we've learned in the past 30 years is condensed into this one chip."
Sixth, chip design has entered the Agent era. NVIDIA is collaborating with Cadence, Siemens, Synopsys, and others to build autonomous AI engineers.
Token Economics: Buy More, Earn More
Token has now become the hottest word among all technology practitioners in Silicon Valley, Taiwan (China), and Shenzhen. Jensen Huang said, "Tokens are now the unit of profit. Each token represents revenue. AI companies want to build more tokens and more AI factories."
The starting price of a 1-gigawatt AI factory project is 20 to 30 billion US dollars. It will soon reach 60 billion and 80 billion. It's 10 billion US dollars per gigawatt. Global technology giants are frantically building AI infrastructure, and computer manufacturers in Taiwan (China) have been extremely busy recently. Jensen Huang said to the industry chain on-site, "You're all so busy, and the enterprises in Taiwan (China) are doing a great job." Behind this statement is the celebration of the entire semiconductor supply chain.
This is Token economics. In the traditional IT era, buying servers was a cost, and computing was a consumption. In the AI era, buying GPUs is an investment, and computing is revenue. Jensen Huang directly drew a line: A cheap chip doesn't mean you're making money, and an expensive chip doesn't mean you're losing money. The cost of choosing the wrong architecture has never been so high. If the throughput per watt of your AI factory is not high enough, the more you buy, the more you'll lose. If the throughput per watt is high enough, the more you buy, the more you'll earn.
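The argument above can be put into rough numbers. The following is a minimal sketch, not a model from the keynote: the `Architecture` class, the `annual_profit` function, the token price, the electricity price, the amortization period, and both platforms' cost and throughput figures are all hypothetical values chosen only to illustrate the reasoning.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 3600
HOURS_PER_YEAR = 365 * 24


@dataclass
class Architecture:
    """A hypothetical accelerator platform; every number here is made up."""
    name: str
    capex_usd_per_watt: float        # build cost per watt of capacity (assumed)
    tokens_per_sec_per_watt: float   # delivered throughput per watt (assumed)


def annual_profit(arch, power_watts, usd_per_million_tokens=1.0,
                  usd_per_kwh=0.08, amortization_years=5):
    """Yearly token revenue minus electricity and amortized build cost."""
    tokens = arch.tokens_per_sec_per_watt * power_watts * SECONDS_PER_YEAR
    revenue = tokens / 1e6 * usd_per_million_tokens
    energy_cost = power_watts / 1000 * HOURS_PER_YEAR * usd_per_kwh
    capex_per_year = arch.capex_usd_per_watt * power_watts / amortization_years
    return revenue - energy_cost - capex_per_year


# Two illustrative choices for the same power budget.
cheap = Architecture("cheap, low throughput", 20.0, 0.1)
pricey = Architecture("expensive, high throughput", 40.0, 0.5)

for arch in (cheap, pricey):
    for gigawatts in (1, 2):
        profit = annual_profit(arch, gigawatts * 1e9)
        print(f"{arch.name:28s} {gigawatts} GW: {profit / 1e9:+.2f} B USD/year")
```

Under these assumed inputs, the cheaper platform loses money and loses twice as much when the factory doubles in size, while the more expensive platform earns money and earns twice as much. Because power is the fixed constraint, throughput per watt, not the purchase price, decides the sign of the result.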
Two years ago, Jensen Huang said the next wave would be Agentic AI. Today he said, "Agentic AI has arrived, and useful AI has arrived."
Jensen Huang presented a set of data: The number of GitHub commits soared from 300 million in 2023 to 500 million in 2026. It nearly tripled in two years. There are 30 million software developers globally, earning a combined 3 trillion US dollars in salary and generating 9 trillion US dollars in output.
Jensen Huang refuted the claim that AI will lead to unemployment: "Some people say AI will make programmers unemployed. It's pure nonsense. The number of engineers is increasing. Since each engineer can create three times the output, of course, enterprises want to recruit more." The value of AI lies not in replacement but in amplification. It enables the output capabilities of each developer and enterprise to grow exponentially. When each software engineer can create three times the value, enterprises have no reason to reduce recruitment; instead, they will expand it. This is the future Jensen Huang sees: A productivity revolution is taking place, and the speed of this revolution is faster than anyone expected.
Agent Architecture: Five Core Components
In the past forty years, the working mode of computers has never changed: Start an application, click to input, and wait for the result. The Agent era is completely different. Users only need to describe their intentions, and AI will automatically generate code or use tools to produce the necessary output.
In traditional computing, software is a binary package that runs inside the operating system and is subject to its scheduling and constraints. Agent computing is heterogeneous and distributed: the model, harness, tools, skills, and runtime sit in different locations in the data center and are coordinated by the CPU.
Jensen Huang spelled out the five core components of the Agent: "This agent consists of model, harness, tools and skills, and a runtime."
Model: It acts as the "brain" and is responsible for understanding, observing, reasoning, and planning. Large language models have integrated synchronous conversion capabilities and can now perform thinking tasks excellently.
Harness: It is the "operating system" that connects everything. During each context processing, it precisely routes information, understands what's happening, and coordinates the components to work together. The distinction between working memory and long-term memory becomes crucial here.
Tools: They can be spreadsheets, web browsers, data processing engines, database engines, C compilers, Python interpreters, JavaScript engines, or even accelerated computing libraries. Whenever the Agent uses tools, the CPU is called to handle these requests.
Skills: This is a breakthrough that Jensen Huang particularly emphasized. Skills are essentially the instruction manuals for tools. After AI reads them, it says, "This is how it's used." All of NVIDIA's CUDA-X libraries will now be equipped with AI-learnable skills. The Agent's ability to use these libraries will far exceed that of human programmers.
Runtime: It is the execution environment that coordinates all components. Security controls run on the CPU and the DPU security processor to monitor the entire process. Memory management is the most difficult part: working memory is similar to a KV cache and needs to handle compressed, retrieved, structured, and unstructured data.
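The division of labor among the five components can be sketched in a few lines. This is a toy illustration under stated assumptions, not NVIDIA's implementation: every class and function name below (`Tool`, `Skill`, `KeywordModel`, `Runtime`, `Harness`, `add_numbers`) is hypothetical, and the "model" is a keyword matcher standing in for a real language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """A callable capability, such as an interpreter or a database engine."""
    name: str
    run: Callable[[str], str]


@dataclass
class Skill:
    """The 'instruction manual' that tells the model when to use a tool."""
    tool_name: str
    instructions: str


class KeywordModel:
    """Stand-in for the model: picks the first tool whose skill text
    shares a word with the request. Purely illustrative."""

    def plan(self, request: str, skills: List[Skill]) -> str:
        words = set(request.lower().split())
        for skill in skills:
            if words & set(skill.instructions.lower().split()):
                return skill.tool_name
        return ""


class Runtime:
    """Execution environment: runs tools and enforces a simple allowlist."""

    def __init__(self, tools: List[Tool], allowed: List[str]):
        self.tools = {tool.name: tool for tool in tools}
        self.allowed = set(allowed)

    def execute(self, tool_name: str, argument: str) -> str:
        if tool_name not in self.allowed:
            raise PermissionError(f"tool {tool_name!r} is not allowed")
        return self.tools[tool_name].run(argument)


class Harness:
    """Routes each request between model and runtime, keeping working memory."""

    def __init__(self, model: KeywordModel, skills: List[Skill], runtime: Runtime):
        self.model = model
        self.skills = skills
        self.runtime = runtime
        self.working_memory = []  # short-lived context for the current task

    def handle(self, request: str) -> str:
        tool_name = self.model.plan(request, self.skills)
        if not tool_name:
            return "no suitable tool"
        result = self.runtime.execute(tool_name, request)
        self.working_memory.append((request, tool_name, result))
        return result


def add_numbers(text: str) -> str:
    """Toy tool: sum every integer that appears in the text."""
    return str(sum(int(tok) for tok in text.split() if tok.isdigit()))


tools = [Tool("calculator", add_numbers), Tool("shout", str.upper)]
skills = [
    Skill("calculator", "use to add or sum numbers"),
    Skill("shout", "use to shout or uppercase text"),
]
harness = Harness(KeywordModel(), skills, Runtime(tools, allowed=["calculator"]))
print(harness.handle("add 2 and 40"))  # prints 42
```

In this sketch the runtime only permits the calculator, so a request that the model routes to the `shout` tool raises `PermissionError`. That mirrors, in miniature, the idea of security controls sitting in the execution environment rather than in the model.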
Agent computing is distributed and heterogeneous. This brings huge technical challenges: When the computing is decomposed, the bandwidth between CPU cores, between the CPU and storage devices, and between the CPU and GPU becomes a bottleneck. When data flows inside and outside the chip, there should be no tri-state loss and no crossing of chip boundaries. The communication delay across chips must be extremely low.
Agent applications operate in a fundamentally different way from past applications. The constraints on past applications came from the operating system, while the constraints on Agents come from the architecture itself: because the computing is distributed, it must run efficiently in a heterogeneous environment.
It is precisely this heterogeneous computing problem that prompted NVIDIA to develop Vera Rubin.
Vera Rubin in Full Production, Shipments to Start in the Fall
Today, Jensen Huang announced that Vera Rubin is accelerating into full-scale production, and the products will start shipping this fall.
Vera Rubin is NVIDIA's largest-scale POD-level platform to date: five dedicated racks form a huge AI supercomputer designed specifically for agent workloads. The platform integrates the Vera Rubin NVL72 system, Vera CPU, Groq 3 LPX, Vera BlueField-4 STX storage, and Spectrum-6 SPX Ethernet rack into a fully integrated system. Compared with the previous-generation NVIDIA Grace Blackwell platform, the large-scale agent throughput of Vera Rubin has increased 10-fold.
Jensen Huang said, "Vera Rubin is born for this moment. It is an artificial intelligence factory engine that can provide intelligence on a large scale and has the performance, efficiency, and security required to drive the next industrial revolution."
Previously, it took two hours to assemble a Grace Blackwell rack, but now it only takes five minutes. There are no cables, no hoses, no fans, and there is only a PCB in the middle connecting the two sides. When Jensen Huang showed this comparison, his tone was full of pride: "Last time when I showed this to you, how much time did it take? We had cables everywhere. But now there is a PCB in the middle, connecting the two parts. What used to take two hours to complete now only takes five minutes."
It's not just higher production capacity; it's a qualitative change in the deployment speed of AI factories. More importantly, the reliability is improved. Without cables, there is no risk of cable failure. Jensen Huang said, "The reliability and resilience of Rubin will be incredibly high."
Top-tier system integrators, infrastructure software, and storage partners are in full production of Vera Rubin products, including Dell Technologies, HPE, Lenovo, and Supermicro, as well as Taiwan (China) OEM giants such as AIC, Compal, Foxconn, Gigabyte, Inventec, Pegatron, Quanta Cloud Technology (QCT), Wistron, and Wiwynn.
The Vera Rubin platform introduces NVIDIA Spectrum-X Ethernet Photonics, the world's first switch based on co-packaged optics (CPO) with 200Gb/s SerDes, which is now in production.
At the same time, the Vera Rubin platform uses full-stack NVIDIA confidential computing technology to create a rack-level trusted execution environment. The Vera Rubin NVL72 integrates the Vera CPU, Rubin GPU, NVIDIA NVLink network, and security features into a unified platform and encrypts data across the high-speed interconnect. This provides hardware-level authentication to ensure the system is tamper-proof.
The NVIDIA DSX platform provides a complete design and operation foundation for the Vera Rubin artificial intelligence factory: it unifies reference design, simulation, infrastructure software, facilities, and ecosystem technology to help build and operate energy-efficient artificial intelligence factories, thereby achieving the lowest Token cost.
Jensen Huang took the time to thank Microsoft, Dell, and CoreWeave because they have built engineering racks for Vera Rubin. This means that OEM partners are no longer just producing components; they are helping NVIDIA verify the entire system. Chips, heat dissipation, networks, and storage are all integrated. This is truly a one-stop delivery.
Vera CPU: The First Processor Built for Agents
Another release in this speech is NVIDIA's first processor specifically built for the AI Agent era: Vera CPU.
Jensen Huang raised a profound question: All past CPUs were designed for humans. Humans use CPUs in a world measured in seconds. Humans can wait, click to close pop-up windows, and adapt to various inconveniences. But Agents are different. Agents are impatient. They don't live in a world measured in seconds; they live in a world measured in nanoseconds. When an Agent uses tools, it wants the fastest possible response time. When it accesses the database, the result must come back as soon as possible. Every moment an Agent waits prevents it from moving on to the next step.
This is why a brand-new CPU architecture is needed. The design of traditional CPUs assumes that users can tolerate a certain amount of delay, but the requirements of Agents are completely different.
In the Vera Rubin rack, the Vera CPU undertakes three key responsibilities: First, orchestration and management. The Vera CPU is used to coordinate and manage GPU tools, manage the KV cache, and handle all the software running in the rack. In complex Agent workflows, these CPUs are the command center of the entire system. Second, security and isolation. Through Vera BlueField, the CPU is responsible for security and isolation functions to ensure that different workloads do not interfere with each other. Third, harness and entry. The Vera CPU is used for tool-use orchestration of AI models and accessing databases.
Jensen Huang pointed out that the architecture design of the Vera CPU revolves around four key features: First, single-thread performance must be extremely high; second, bandwidth per core must be extremely high; third, the total bandwidth inside and outside the chip must be extremely high; fourth, energy efficiency must be extremely high.
Compared with x86 CPUs, Vera completes tasks 1.8 times faster. It can drive workloads across all industries, including agentic AI, reinforcement learning, and data processing, thereby generating more data center token revenue. Jensen Huang