Nvidia introduces the H200, an AI-crunching monster GPU that may speed up ChatGPT

Key Takeaways:

– Nvidia has announced the HGX H200 Tensor Core GPU, which utilizes the Hopper architecture to accelerate AI applications.
– Lack of computing power has been a major bottleneck in AI progress, hindering deployments of existing AI models and slowing the development of new ones.
– The H200 is not for graphics but is ideal for AI applications, as GPUs perform parallel matrix multiplications necessary for neural networks.
– The H200 offers 141GB of memory and 4.8 terabytes per second of bandwidth, 2.4 times the memory bandwidth of the previous-generation Nvidia A100.
– The H200 will be available in various form factors and will be deployed by cloud service providers starting next year.
– Nvidia has faced export restrictions for its powerful GPUs, limiting sales to China, and has introduced new chips to get around those barriers.
– Expect more back-and-forth moves between the US and Nvidia regarding export restrictions in the coming months.

Ars Technica:

Image: The Nvidia H200 GPU covered with a fanciful blue explosion that figuratively represents raw compute power bursting forth in a glowing flurry. (Credit: Nvidia | Benj Edwards)

On Monday, Nvidia announced the HGX H200 Tensor Core GPU, which utilizes the Hopper architecture to accelerate AI applications. It's a follow-up to the H100 GPU, released last year, which was until now Nvidia's most powerful AI GPU. If widely deployed, the H200 could lead to far more powerful AI models, and faster response times for existing ones like ChatGPT, in the near future.

According to experts, lack of computing power (often called "compute") has been a major bottleneck in AI progress this past year, hindering deployments of existing AI models and slowing the development of new ones. Shortages of the powerful GPUs that accelerate AI models are largely to blame. One way to alleviate the compute bottleneck is to make more chips, but another is to make each AI chip more powerful. That second approach may make the H200 an attractive product for cloud providers.

What's the H200 good for? Despite the "G" in the "GPU" name, data center GPUs like this typically aren't for graphics. GPUs are ideal for AI applications because they perform vast numbers of parallel matrix multiplications, which are necessary for neural networks to function. They are essential both in training an AI model and in "inference," where people feed inputs into a trained model and it returns results.
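To make that concrete, here is a minimal sketch in plain NumPy of a single dense layer's forward pass; the layer sizes are illustrative assumptions, not drawn from any real model. Every output element is an independent dot product, which is exactly the kind of massively parallel arithmetic a data center GPU is built to accelerate.

```python
import numpy as np

# Illustrative sizes only; real model layers vary widely.
batch_size, d_in, d_out = 32, 4096, 4096

x = np.random.randn(batch_size, d_in).astype(np.float32)   # input activations
W = np.random.randn(d_in, d_out).astype(np.float32)        # layer weights
b = np.zeros(d_out, dtype=np.float32)                      # bias

# The forward pass of one dense layer is a single matrix multiplication.
# On a GPU, the many independent dot products run in parallel.
y = x @ W + b

# Rough cost: 2 * batch * d_in * d_out floating-point operations.
flops = 2 * batch_size * d_in * d_out
print(f"~{flops / 1e9:.1f} GFLOPs for one layer on one batch")
```

Training repeats this work (plus a backward pass) across billions of parameters, which is why GPU supply constrains both training and inference.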

"To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory," said Ian Buck, vice president of hyperscale and HPC at Nvidia, in a news release. "With Nvidia H200, the industry's leading end-to-end AI supercomputing platform just got faster to solve some of the world's most important challenges."

For example, OpenAI has repeatedly said it’s low on GPU resources, and that causes slowdowns with ChatGPT. The company must rely on rate limiting to provide any service at all. Hypothetically, using the H200 might give the existing AI language models that run ChatGPT more breathing room to serve more customers.
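OpenAI hasn't said publicly how its rate limiting is implemented, but a token bucket is one common pattern for capping request throughput when backend capacity, such as GPU inference, is scarce. The sketch below is a generic illustration of that pattern, not OpenAI's actual mechanism.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: requests spend tokens, and
    tokens refill at a fixed rate, capping sustained throughput while
    allowing short bursts. Purely illustrative."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = capacity       # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should reject or queue the request

limiter = TokenBucket(rate_per_sec=5, capacity=10)
print([limiter.allow() for _ in range(12)])  # first 10 pass, then throttled
```

More GPU capacity raises the sustainable rate, which is the "breathing room" the article describes.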

4.8 terabytes/second of bandwidth

Image: The Nvidia H200 GPU. (Credit: Nvidia)

According to Nvidia, the H200 is the first GPU to offer HBM3e memory. Thanks to HBM3e, the H200 offers 141GB of memory and 4.8 terabytes per second of bandwidth, which Nvidia says is 2.4 times the memory bandwidth of the Nvidia A100, released in 2020. (Despite the A100's age, it's still in high demand due to shortages of more powerful chips.)
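Those numbers are easy to sanity-check with a little arithmetic. Dividing 4.8 TB/s by 2.4 implies an A100 baseline of 2.0 TB/s, which matches the 80GB A100's published bandwidth. The capacity figure matters because a model's weights must fit in GPU memory; the 70-billion-parameter example below is hypothetical but shows the scale that 141GB enables.

```python
# Back-of-envelope sanity check of the quoted figures (not official specs).
h200_bw_tbs = 4.8                    # H200 memory bandwidth, TB/s
a100_bw_tbs = h200_bw_tbs / 2.4      # implied A100 baseline
print(f"Implied A100 bandwidth: {a100_bw_tbs:.1f} TB/s")  # ~2.0 TB/s (80GB A100)

# Why capacity matters for inference: weights must fit in GPU memory.
# Hypothetical example: a 70-billion-parameter model at FP16 (2 bytes/param).
params = 70e9
weights_gb = params * 2 / 1e9
print(f"70B parameters at FP16: {weights_gb:.0f} GB of weights")  # 140 GB
```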

Nvidia will make the H200 available in several form factors, including Nvidia HGX H200 server boards in four- and eight-way configurations that are compatible with both the hardware and software of HGX H100 systems. It will also be available in the Nvidia GH200 Grace Hopper Superchip, which combines a CPU and GPU into one package for even more AI oomph (that's a technical term).

Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be the first cloud service providers to deploy H200-based instances starting next year, and Nvidia says the H200 will be available “from global system manufacturers and cloud service providers” starting in Q2 2024.

Meanwhile, Nvidia has been playing a cat-and-mouse game with the US government over export restrictions on its powerful GPUs that limit sales to China. Last year, the US Department of Commerce announced restrictions intended to "keep advanced technologies out of the wrong hands," namely countries like China and Russia. Nvidia responded by creating new chips to get around those barriers, but the US recently banned those, too.

Last week, Reuters reported that Nvidia is at it again, introducing three new scaled-back AI chips (the HGX H20, L20 PCIe, and L2 PCIe) for the Chinese market, which represents a quarter of Nvidia’s data center chip revenue. Two of the chips fall below US restrictions, and a third is in a “gray zone” that might be permissible with a license. Expect to see more back-and-forth moves between the US and Nvidia in the months ahead.

AI Eclipse TLDR:

Nvidia has announced the HGX H200 Tensor Core GPU, a powerful chip that utilizes the Hopper architecture to accelerate AI applications. It is a successor to the H100 GPU, which was previously Nvidia’s most powerful AI GPU chip. The H200 has the potential to enable more powerful AI models and faster response times for existing models like ChatGPT. The lack of computing power has been a major bottleneck in AI progress, and shortages of powerful GPUs have hindered the deployment of AI models and the development of new ones. The H200 offers 141GB of memory and 4.8 terabytes per second bandwidth, which is 2.4 times the memory bandwidth of the Nvidia A100 released in 2020. It will be available in various form factors and will be deployed by cloud service providers such as Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure. Nvidia has also been facing export restrictions from the US government, but it continues to introduce new chips for the Chinese market.