
NVIDIA CEO Jensen Huang: NVIDIA builds a new "AI Factory" every year, with AI performance increasing by 2 to 3 times annually | Frontline

Tian Zhe 2024-10-16 19:46
Performance can double because NVIDIA builds the entire data center and AI factory end to end and develops the software end to end.

Written by | Tian Zhe

Edited by | Su Jianxun

On October 16, Lenovo held "Lenovo Tech World" in Seattle, USA, announcing a series of new products, technologies, and partnership updates.

Among them, Lenovo announced that it is further expanding its cooperation with NVIDIA, with the two companies jointly releasing the Hybrid AI Advantage Set and the ThinkSystem SC777, a GB200 liquid-cooled AI server.

Yang Yuanqing, CEO of Lenovo Group, said that the Hybrid AI Advantage Set is an end-to-end AI platform that helps enterprises put AI into practice, thereby optimizing processes, supporting better decisions, and improving productivity.

He also said that the ThinkSystem SC777 is equipped with NVIDIA Grace Blackwell GB200 and features a 100% water-cooled design that dissipates heat without fans or data center air conditioning. The system fits in a standard rack and uses a standard power supply, and customers can purchase as little as one tray at a time.

At the event, Yang Yuanqing had a conversation with NVIDIA CEO Jensen Huang. Jensen Huang believes that AI agents will run around helping users complete various tasks, and he hopes to produce billions of AI agents every year.

He expects that every company will be able to adopt large language models and turn them into its own agents to perform specified tasks. To meet this customer demand, he said, AI performance that doubles or triples every year lets NVIDIA reduce the cost, workload, and energy consumption of AI while increasing its revenue-generating ability.

The reason is that NVIDIA can build the entire data center and AI factory end to end and develop the software end to end. It can therefore build a new AI factory every year, doubling performance while reducing costs and rapidly advancing the development and democratization of AI.
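
As a rough illustration of how such annual gains compound, the sketch below assumes (this is not an NVIDIA figure) that a factory's yearly cost stays fixed while its token throughput multiplies by a constant factor each year. The function name `relative_cost_per_token` and the fixed-cost assumption are hypothetical and serve only to show the arithmetic.

```python
def relative_cost_per_token(annual_speedup: float, years: int) -> float:
    """Cost per token relative to year 0.

    Assumption (illustrative only): factory cost is constant and token
    throughput multiplies by `annual_speedup` every year, so the cost
    per token falls by the same factor each year.
    """
    if annual_speedup <= 0:
        raise ValueError("annual_speedup must be positive")
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1.0 / (annual_speedup ** years)


if __name__ == "__main__":
    # Compare the two growth rates quoted in the article: 2x and 3x per year.
    for speedup in (2.0, 3.0):
        for years in (1, 2, 3):
            ratio = relative_cost_per_token(speedup, years)
            print(f"{speedup:.0f}x per year, after {years} year(s): "
                  f"cost per token is {ratio:.4f} of the year-0 cost")
```

Under these assumptions, three years of doubling leaves the cost per token at one eighth of its starting value. Real costs would also depend on hardware prices, energy, and utilization, which this sketch ignores.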

The AI factory Jensen Huang refers to is a new type of data center. According to NVIDIA's official website, enterprises in many regions are working with NVIDIA to move traditional data centers to accelerated computing and to build AI factories that produce artificial intelligence.

The following is the conversation between Yang Yuanqing and Jensen Huang, edited by 36Kr:

Yang Yuanqing: Good afternoon! Thank you for your continued attention to and support for our enterprise AI solutions. Let's do a quick review. Today, we have shared our understanding of a future defined by hybrid artificial intelligence, demonstrated our innovations in personal AI and enterprise AI, and, most importantly, reaffirmed the vision of "Smarter AI for All". None of this could be achieved without a very important partner. Let's welcome NVIDIA CEO Jensen Huang!

Jensen Huang: Thank you, Yuanqing! It's great to be here.

Yang Yuanqing: We discussed hybrid artificial intelligence at last year's Tech World. Since then, both sides have made a lot of progress. The audience must also be eager to hear your views. What do you think is the next step for hybrid artificial intelligence? How is customer adoption going?

Jensen Huang: First of all, it's great to be here again to announce a series of important new initiatives with our partner Lenovo. Yuanqing and I have known each other for a long time. We can even say that we have known each other since we were kids. We have experienced several computing revolutions together, first the PC revolution, then the Internet revolution, and later the mobile cloud revolution. And now, we are reshaping the entire architecture of the computing field on an unprecedented scale.

What we used to call "programming" has now become "machine learning". Programming runs on the CPU, while machine learning runs on the GPU. It is amazing that programming gave birth to software and drove an entire huge industry. And now, machine learning is creating artificial intelligence, which will be the largest industrial revolution we have ever seen. In the past 12 months, we have seen amazing progress in various industries. Every enterprise, every industry, and every country has realized that its digital knowledge and data can be encoded and transformed into the digital intelligence of that country, enterprise, or industry.

Of course, a major recent event is the large model Llama 3 mentioned by Mark Zuckerberg. It really changed the rules of the game. Because of Llama 3, every company now has the opportunity to access AI as long as it has the required infrastructure and architecture. With AI computers and AI infrastructure, as well as the very critical software stack, we can transform large models into artificial intelligence. Artificial intelligence is closely related to large models. We need large language models, very complex large language models and technologies, but (it should be remembered that) large language models are one important part of artificial intelligence. What I just saw in the announcement, the complete architecture that our companies have developed together as partners, is precisely meant to establish the required infrastructure, software stack, best practices, and blueprints, so that we can transform large language models into agents that really help us complete tasks.

Yang Yuanqing: Then what is your view on agent AI and physical AI?

Jensen Huang: Regarding agent AI, ladies and gentlemen, we hope to produce millions or even billions of such AIs. At NVIDIA, we call these "Toy Jensens". They will run around helping you complete various tasks. Whatever you need done, they will meet your requirements. Broadly speaking, artificial intelligence is essentially robotics.

In the future, there will be digital robots, which we call agents. They have the ability to understand your instructions, understand the meaning of the instructions, break them down into specific actions, use tools, retrieve proprietary information or any information they can access, complete tasks and take actions when necessary. So, they can sense, reason and perform actions. This basic cycle is the core cycle of robotics. Therefore, we will have information robots, which we call agents.

We will also have physical robots. These physical robots are essentially AI that understands the physical world. In addition, we also have agents that understand the information world. These two types of artificial intelligence - agent AI and physical AI, will become the cornerstone of the global industry. We will have AI "colleagues", who may be good at marketing, or in our company, they can be engaged in chip design, software programming, verification or building agents to assist us in supply chain management.

These agents work alongside all our employees, thereby greatly improving our productivity. Essentially, what we want to achieve is "superhuman" productivity, right? All employees will therefore receive timely support from these agents to improve productivity. That is what we will be doing next. And the greatest opportunity here is industrial AI, which is closely related to robotics, and we are cooperating with Lenovo in this field.

Yang Yuanqing: I also see the same trend in agent AI and physical AI. To seize this opportunity, Lenovo and NVIDIA will have a major announcement today, namely the "Lenovo Hybrid AI Advantage Set".

This is an end-to-end AI platform for developing and deploying AI in the new era. It is based on industry-leading infrastructure, including AI devices, AI servers, and storage, as well as edge computing, public cloud, and private cloud. It is where enterprises store, clean, and organize data. Data, algorithms, and accelerated computing together help enterprises put AI into practice, thereby optimizing processes, supporting better decisions, and improving productivity.

Among all these elements, services play an important role - providing full lifecycle management of the infrastructure including design, deployment, expansion, and maintenance, improving data analysis and governance capabilities, providing consulting and tuning for AI models, and expanding the AI software ecosystem for customers. Therefore, on the basis of the concept of the hybrid AI factory, we also provide the Lenovo AI Application Library. Our strategy is to combine modularization and customization to quickly respond to customer needs and tailor solutions for them.

All these areas together make up the complete "Lenovo Hybrid AI Advantage Set" system, built by Lenovo in cooperation with NVIDIA. We will work with you to make the "Hybrid AI Advantage Set" more mature and complete. Would you like to talk more about how NVIDIA's technology helps build this platform?

Jensen Huang: Yuanqing, when you and I first met, it was during the PC revolution. At that time, the PC architecture was simple, with a CPU, an operating system, and applications. Of course, that computing model was revolutionary at the time, but compared to today, it was very simple. It has taken the entire industry a very long time to bring artificial intelligence into being because it is extremely complex. Its architecture includes a real supercomputer as the computing infrastructure, running algorithms and distributed computing over a computing fabric that includes NVLink/Switch, InfiniBand, or Spectrum-X networks. Very complex software is what makes this distributed computing possible and efficient.

So this is the computer. But on top of it is the large language model, which may be the new operating system of the artificial intelligence world. On top of the large language model, there are many other application libraries. A simple way to understand these application libraries is that they help a new AI colleague onboard. It is an AI colleague that comes to perform tasks, and your job is to create the data that corrects it and helps it onboard. It is your AI agent, and you need to teach it very specific skills.

You will provide them with information so that they can carry out their work. You evaluate them as you evaluate employees. You deploy them, set guardrails for them, protect them, and ensure that the functions they perform are in line with the functions they are taught to perform. And Blackwell makes the agents get better and better over time. This entire set of application libraries is fully accelerated and runs on the GPU.

This makes it possible for every company to introduce large language models and transform them into their own agents to perform the specific tasks you want them to perform, such as customer service, AI database retrieval, and many different types of applications.

Yang Yuanqing: NVIDIA is indeed providing a lot of value here, with corresponding platforms at every level. (We need to) meet customers' higher goals: they expect not only more powerful accelerated computing but also better energy efficiency. How does NVIDIA help customers meet these needs?

Jensen Huang: In the early stages of a new computing revolution, your best choice is to accelerate your roadmap. Here's why: when performance can double or triple, and you know this is annual growth, we can effectively reduce the cost of AI and reduce its workload and energy consumption, while at the same time increasing its revenue-generating ability. These new computers are equivalent to highly efficient factories.

Unlike data centers that store files, these data centers are AI factories that produce tokens. These tokens are, in effect, artificial intelligence, and you want them produced at the highest possible speed.

If we can increase the speed by two or three times every year, we can help you reduce costs and increase revenue. So this is very important. We have this ability because we have built the entire AI factory end to end, from the CPU, GPU, NVLink/Switch, InfiniBand switches, and networking chips to Ethernet switches and networking technologies.

We have the ability to build the entire data center and AI factory end to end, and we also have the ability to develop the software end to end. Because of this, we can build a new AI factory every year, doubling performance while reducing costs and advancing the development and democratization of AI as fast as possible.

Yang Yuanqing: All our R&D teams are shouldering the mission of designing for sustainability.

Jensen Huang: Now, speed is sustainability, speed is performance, and speed is energy efficiency.

Yang Yuanqing: There is no doubt that NVIDIA is doing an excellent job on sustainability. Lenovo also brings core technologies. We have led in liquid cooling for ten years, and our Neptune liquid cooling technology is now in its sixth generation. So let's show the products that use the sixth-generation Neptune liquid cooling system.

Jensen Huang: Sure, please. You know, Yuanqing, all the efforts Lenovo has made in building high-performance computers over the years have been worth it, haven't they?

Yang Yuanqing: Of course. We have been researching water-cooling technology for more than ten years. This is the ThinkSystem SC777 system designed by Lenovo, equipped with NVIDIA Grace Blackwell GB200. It adopts a 100% water-cooled design, so it does not require any fans or dedicated data center air conditioning. The system fits in a standard rack and uses a standard power supply, and customers can purchase as little as one tray at a time. It also includes NVIDIA NVLink interconnect technology and supports NVIDIA networking and NVIDIA AI Enterprise software. What do you think of it?

Jensen Huang: Right, this is really beautiful. For an engineer, this is very sexy.

Yang Yuanqing: Thank you, Jensen, and thank you for our important partnership. In addition to cooperation in the enterprise data center, we have also cooperated with NVIDIA in the automotive computing field and launched the NVIDIA DRIVE AGX Thor platform with the new Blackwell architecture. Now, let's show Lenovo's smarter and more powerful AI computing DCU with Jensen.

Jensen Huang: Yuanqing, do you know how amazing this is? The architecture of this computer and that computer is exactly the same. That computer is used to create the "brain" that runs on this computer. Because the car is the highest-volume robot in the world.

Yang Yuanqing: Yes, exactly, a robot on wheels.

Jensen Huang: It can sense the world, plan its actions, and control the steering wheel, just like a robot. So this will be among the first of the highest-volume robots in the world. But in the future, we will have all kinds of robotic systems, and this special computer can be used for all of them. So, yes, it's very exciting. Thank you.