Interview with Jen-Hsun Huang: My perspective on GPUs sets me apart

Author | Wan Chen

Editor | Jingyu


The atmosphere suddenly became serious.

“Some media say you are either the Leonardo da Vinci of the AI era or the Oppenheimer of the AI era. What do you think?”

“Oppenheimer made bombs. We (NVIDIA) don't do that.” Faced with this half-joking question, NVIDIA founder and CEO Jensen Huang hesitated for a moment, then answered it very seriously.

On March 19, local time, the day after delivering a GTC 2024 opening keynote with the popularity of a pop superstar's concert, Jensen Huang sat down for interviews with global media.


Jensen Huang re-explained the key points of the “concert” to the assembled media | Image source: Geek Park

Whether faced with grand questions such as “When will AGI arrive?” and “How does NVIDIA view the Chinese market?”, or practical ones such as how to apply the newly launched NIM software, the leader of the world's third-largest company by market capitalization broke each problem down, abstracted it to a more understandable level, and answered with simple metaphors. There may have been some deflection in the answers, but it was hard to doubt the answerer's sincerity.

With a market capitalization newly past two trillion dollars, Huang argues that the GPU chip market is not what NVIDIA is pursuing: “NVIDIA does not make chips; NVIDIA makes data centers.” To that end, NVIDIA has built everything, hardware, software, and services, so that customers can decide how to purchase their own data centers.

In the GTC 2024 keynote, Huang presented five key points: under the new industrial revolution (accelerated computing and generative AI), NVIDIA's new infrastructure includes the Blackwell platform; NIM; NeMo and NVIDIA AI Foundry; and Omniverse and Isaac robots. | Image source: NVIDIA

01. Plans for the new GTC products in the Chinese market

Q: How much of the new networking hardware and technology do you plan to sell to China? Are there any China-specific SKUs you can disclose? Have any considerations or changes been made for this market?

Jen-Hsun Huang: I haven't announced that yet; you're getting greedy (laughs). That's the whole answer. For China, we now have the L20 and H20 chips that meet export requirements, and we are doing our best to organize resources for the Chinese market.

02. AI Foundry's goals

Q: You mentioned in your keynote that AI Foundry is working with many enterprises. What are the overall strategy and long-term goals of this program?

Jen-Hsun Huang: The goal of AI Foundry is to build software. I don't mean software as a tool; anyone has that kind of software. Two of the most important pieces of software were created long ago. One is called Office, which made software RTS (real-time software).

Another very important piece of software is cuDNN (the CUDA deep neural network library). We have many different AI libraries like it. The library of the future is a microservice, because the future library will be described not only by mathematics but also by AI. In the future, they will all become NIMs (microservices).

These NIMs are super-complex pieces of software, and all you have to do is come to our website. You can run them where you are, download them and run them on another cloud, or download them and run them on your local computer. The service makes them run very efficiently on your workstations and in your data center, so it is a new way to use software in any environment. When you run these libraries as an enterprise, we offer a license, much like licensing an operating system, at $4,500 per GPU per year.
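To make the “download it and run it anywhere” idea concrete, here is a minimal sketch of calling a locally running NIM container. It assumes, as the NIM documentation describes, that the container exposes an OpenAI-compatible chat endpoint; the URL and model name below are illustrative placeholders, not guaranteed values.

```python
import json
import urllib.request

# Assumed local endpoint of a running NIM container (placeholder).
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt, model="meta/llama3-8b-instruct", max_tokens=64):
    """Build the JSON payload for an OpenAI-compatible chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_nim(prompt):
    """POST the prompt to the local NIM; requires a running container."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        NIM_URL, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The same payload works whether the container runs on a workstation, in a private data center, or in a cloud, which is the portability Huang is describing.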

03. Blackwell pricing

Q: You said earlier that the latest-generation Blackwell AI chip is priced at US$30,000 to US$40,000. Can you be more precise?

Jen-Hsun Huang: It's hard to say. I was just trying to give everyone a feel for the pricing of our products; I don't intend to give a specific quote.

Blackwell systems are priced very differently because everyone wants a different configuration. Blackwell is rarely used on its own; Blackwell systems usually include NVLink, so pricing varies from system to system. As usual, the price range is ultimately determined by TCO (total cost of ownership).

NVIDIA doesn't make chips; NVIDIA makes data centers. We set up all the workloads, bring in all the software, and tune everything so that the data center system runs as well as possible. Then comes the crazy part: we break the data center down into smaller pieces and let customers modify it for their specific needs, including networking, storage, the control plane, security, and management modules. We figure out how to integrate those smaller parts into the customer's own systems, and ultimately the customer decides how to buy. So unlike selling chips in the past, Blackwell's pricing is not about the chip, and our business model reflects that.

NVIDIA's opportunity is not GPU chips; it is data centers. Data centers are rapidly moving to accelerated computing. This is a $250 billion annual market growing at 20% to 25% a year, driven mainly by demand for AI. NVIDIA will take an important share of it, and I think a rise from US$1 trillion to US$2 trillion is reasonable.

Jen-Hsun Huang: There is a big difference between the GPU you are talking about and the GPU I am talking about. | Image source: Geek Park

04. Sam Altman's expansion into the chip industry

Q: Sam Altman has been talking to people in the chip industry about massively expanding AI chip capacity. Has he talked to you about this?

Jen-Hsun Huang: I don't know his intentions. He thinks generative AI will become big, and I agree with that.

The way computers produce pixels today is retrieval-based: data is fetched from a data set, processed, and delivered. You would think this consumes very little energy, but the opposite is true. Every time you touch your phone, every prompt triggers a round trip to the data set: retrieving the data, using the CPU to gather all the necessary pieces, combining the information in a way that makes sense from a recommender-system perspective, and then sending the result back to the user. It is a computationally intensive process.

It's like every time I'm asked a question, I have to run back to the office to retrieve the information, which takes a lot of energy. In the future, more and more computing will be generative rather than retrieval-based. Of course, the generation must be intelligent and contextual. I believe that in the future almost every pixel and every interaction on people's computers will be produced by a generative process, and I believe Sam thinks so too. I hope Blackwell's new architecture can make a significant contribution to generative AI. Most experiences today are still retrieval-based, but I would not be surprised if in the future every human-computer interaction is a generative experience. This is a huge opportunity.

05. What will the personal model look like?

Q: I completely agree with your definition of future software; our lives have also changed a great deal through LLMs. What do you think the future holds for base models?

Jen-Hsun Huang: The core question is: how does each individual get their own large model? There are several ways to do it. At first, we thought the process would require fine-tuning, with continued fine-tuning throughout continued use.

But, as you know, fine-tuning is quite time-consuming. Then we discovered prompt engineering, then in-context learning, then working memory, and so on.

I think the answer will be a combination of all of these. In the future, you can fine-tune just a small set of weights called LoRA (low-rank adaptation), freezing the rest of the model, which makes fine-tuning cheap. You can combine that with prompt engineering, in-context learning, and expanded model memory; all of this has already been achieved. Your own unique large model can run in the cloud or on your local computer.
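The LoRA idea Huang refers to can be sketched in a few lines: keep the base weight matrix frozen and train only a low-rank update. This is a minimal NumPy illustration of the mechanism, not NVIDIA's or anyone's production implementation; the class name and parameters are ours.

```python
import numpy as np

class LoRALinear:
    """A frozen linear layer plus a trainable low-rank (LoRA) update.

    The base weight W stays fixed; only the small matrices A and B are
    trained, so trainable parameters drop from d_out * d_in to
    rank * (d_in + d_out).
    """

    def __init__(self, d_in, d_out, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)  # frozen
        self.A = rng.standard_normal((rank, d_in)) * 0.01            # trainable
        self.B = np.zeros((d_out, rank))                             # trainable, starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        # Effective weight is W + scale * B @ A. Because B starts at
        # zero, the adapted layer initially behaves exactly like the
        # frozen base layer.
        return x @ (self.W + self.scale * self.B @ self.A).T

    def trainable_params(self):
        return self.A.size + self.B.size
```

For a 16-by-8 layer with rank 2, only 48 parameters are trainable instead of 128, which is why per-user fine-tuning becomes cheap enough to imagine a personal large model.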

06. Opinions on AI chip startups

Q: After your keynote yesterday, the chip company Groq tweeted that its chips are still faster. What do you think? Any comments on AI chip startups?

Jen-Hsun Huang: I don't know much about that yet (laughs), so I won't comment.

Every token-generating model needs its own unique approach, because Transformer is not the name of any single model.

These models are generally based on Transformer technology and all use the Transformer attention mechanism, but there are huge differences between them. Some models use Mixture of Experts: some have two expert models, some have four. The way they buffer and route tokens to experts differs, every step in the pipeline differs, and each model requires its own special optimization.
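The routing Huang mentions can be illustrated with a toy top-1 Mixture-of-Experts layer: a gate scores the experts and each token is dispatched to its single highest-scoring expert. This is a didactic NumPy sketch under our own simplifying assumptions (top-1 routing, linear experts), not any production model's architecture.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class TinyMoE:
    """Minimal top-1 Mixture-of-Experts layer: a gate scores the
    experts and each token is routed to one expert only."""

    def __init__(self, d_model, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        self.gate = rng.standard_normal((d_model, n_experts)) * 0.02
        # Each expert is its own small linear transform.
        self.experts = [rng.standard_normal((d_model, d_model)) * 0.02
                        for _ in range(n_experts)]

    def forward(self, x):
        # x: (n_tokens, d_model)
        scores = softmax(x @ self.gate)   # (n_tokens, n_experts)
        choice = scores.argmax(axis=-1)   # top-1 expert per token
        out = np.empty_like(x)
        for e, W in enumerate(self.experts):
            mask = choice == e
            if mask.any():
                # Only the chosen expert's weights touch these tokens,
                # weighted by the gate's confidence.
                out[mask] = (x[mask] @ W) * scores[mask, e:e + 1]
        return out, choice
```

Because only the chosen expert runs per token, compute stays roughly constant as experts are added; but the gather/scatter routing is exactly the kind of step that differs between models, which is Huang's point about why fixed-function hardware struggles here.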

If a computing unit is designed to do only specific things in specific ways, it is a configurable computer rather than a programmable one, and it cannot benefit from the speed and potential of software innovation.

The wonder of the CPU should not be underestimated. The reason the CPU has endured for so many years is that it outlasted all the configurable hardware that appeared on PC motherboards over the years, because the talents of software engineers can be realized through the CPU. Conversely, if you fix functionality in the chip, you cut off the intelligence that software engineers could bring to it.

This is why NVIDIA chips perform so well across different AI model architectures, from AlexNet all the way to the Transformer. NVIDIA has found a way to benefit from a very specialized form of computing while staying programmable. The chip is there to serve software, and NVIDIA's job is to enable invention, to enable things like ChatGPT to be invented.

07. How does robot space simulation use language models?

Q: You talked about using generative AI and simulation to train robots at scale, but there are many things we don't know how to simulate well, especially unstructured environments. How do we break through those limitations and keep training robots?

Jen-Hsun Huang: There are several ways to do this. First, you frame your question or idea in the context of a large language model.

Large language models operate in an unconstrained and unstructured way, which is part of their potential. A model learns a great deal from text, but that may not generalize directly. How these models generalize in space is a kind of “magic”, and robotics' ChatGPT moment may be just around the corner.

To overcome this problem, you can specify the context and the problem, for example telling the model it is in a kitchen under certain conditions. By applying the same “magic” as ChatGPT, robots can generalize effectively and generate tokens that are meaningful to the software. Once your computer's sensors recognize those tokens, the robot can generalize from the examples.

08. Predict the next ChatGPT moment

Q: You mentioned that some industries will see their ChatGPT moment first. Which industries will change first? Can you share a breakthrough you've seen, a case that was particularly exciting for you?

Jen-Hsun Huang: There are many examples. I'm very excited about Sora, and I saw the same capability from Wayve last year. These are examples of text-to-video generation.

To generate such a video, the model must be aware of physical laws: a person placed at a table sits at it, not inside it, and a person walking walks on the ground. The laws of physics cannot be violated.

Another example is our use of Earth-2 to forecast the impact of extreme weather. This is a critical area of research because extreme weather events can have devastating impacts on local communities. Using Earth-2, the effects of extreme weather can be predicted at a resolution of 3 kilometers. This is a significant improvement over existing methods, which would require supercomputers 25,000 times larger.

Generating new drugs and proteins is another very impressive potential use case. This is achieved through reinforcement-learning loops like AlphaGo's, which allow the exploration of a huge molecular space without consuming physical material, potentially revolutionizing drug discovery.

These are very powerful things, and the same goes for robotics.

In the GTC opening keynote on March 18, Jensen Huang presented products based on the latest Blackwell architecture | Image source: Geek Park

09. How do chip export controls affect Nvidia?

Q: What impact will chip export controls and geopolitics have on NVIDIA?

Jen-Hsun Huang: There are two things we must do immediately. First, understand all the policies and ensure compliance; second, improve supply-chain resilience.

Regarding the latter, let me give you an example. When we configure Blackwell chips into a DGX system, 600,000 parts come from all over the world, many of them from China. Like the global automotive supply chain, a globalized supply chain is difficult to break apart.

10. Relationship with TSMC

Q: Can you talk about your relationship with TSMC? As chip packaging has continued to become more complex over the past few years, how has TSMC helped Nvidia achieve its goals?

Jen-Hsun Huang: Working with TSMC is one of our closest collaborations, because what we have to do is very difficult and they do it very well.

We get the compute units, the CPU and GPU bare dies, from TSMC, and the yield is very good. The memory comes from Micron, SK Hynix, and Samsung, and assembly is completed in Taiwan. The supply chain is therefore not an easy task; it requires coordination between companies. These large companies are working with us and are gradually realizing that closer cooperation is necessary.

We obtain components from various companies, assemble them, test them with a third company, and build a system with a fourth. That large system is ultimately used to build a supercomputer, which is then tested. Eventually, we build the data center. Imagine: all that processing and manufacturing goes into forming a huge data center. The entire supply chain is very complex from top to bottom, because we are not just assembling parts; the chip itself is already a miracle, and on top of it we have built a huge and complex system.

So when people ask me what a GPU is, part of me pictures an SoC (system on a chip), but what I actually see are racks, cables, switches, and so on. That is my mental model of the GPU and its software. TSMC is really important.

11. Cloud business strategy

Q: NVIDIA is expanding into the cloud business, while other cloud vendors are making their own chips. Will they affect your pricing strategy? What is NVIDIA's cloud business strategy? Will you sell the DGX Cloud business to Chinese customers?

Jen-Hsun Huang: NVIDIA works with cloud service providers to put its hardware and software into their clouds; in doing so, the goal is to bring customers to their clouds.

NVIDIA is a computing platform company. We develop software, and we have a large group of developers who follow NVIDIA. Therefore, we create demand and bring customers to cloud service providers (CSPs) through NVIDIA DGX.

12. “Contemporary Leonardo da Vinci” or “Oppenheimer”?

Q: You once said that AGI will arrive within 5 years. Has that prediction changed? Does the accelerating arrival of AGI scare you? Some say you are a contemporary Leonardo da Vinci (for your versatility and contributions), while others say you are a contemporary Oppenheimer. What do you think?

Jen-Hsun Huang: Oppenheimer made bombs. We (NVIDIA) don't do that.

First, define AGI specifically, so that we know how far along we are and when we have arrived. If AGI means that, on a large suite of tests (math tests, reading tests, logic tests, medical exams, law exams, the GMAT, the SAT, and so on), a software program can do better than most humans, or even better than everyone, then computers can achieve AGI within 5 years.
