
Nvidia built a massive dual GPU to power models like ChatGPT

Nvidia’s semi-annual GPU Technology Conference (GTC) usually focuses on advancements in AI, but this year, Nvidia is responding to the massive rise of ChatGPT with a slate of new GPUs. Chief among them is the H100 NVL, which stitches two of Nvidia’s H100 GPUs together to deploy large language models (LLMs) like ChatGPT.

The H100 isn’t a new GPU. Nvidia announced it a year ago at GTC, sporting its Hopper architecture and promising to speed up AI inference in a variety of tasks. The new NVL variant, with a massive 94GB of memory per GPU, is said to work best when deploying LLMs at scale, offering up to 12 times faster inference than the last-gen A100.

Image: Nvidia’s H100 NVL being installed in a server. (Nvidia)

These GPUs are at the heart of models like ChatGPT. Nvidia and Microsoft recently revealed that thousands of A100 GPUs were used to train ChatGPT, an effort more than five years in the making.


The H100 NVL works by combining two H100 GPUs over Nvidia’s high-bandwidth NVLink interconnect. This is already possible with standard H100 GPUs — in fact, up to 256 H100s can be connected together through NVLink — but this dedicated unit is built for smaller deployments.
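To make that concrete, here is a minimal sketch of what serving an LLM across two NVLink-linked GPUs can look like in practice, assuming a PyTorch and Hugging Face Transformers stack; the model name, the peer-access check, and the automatic sharding are illustrative assumptions, not Nvidia’s reference deployment.

# Illustrative sketch: splitting an LLM across two NVLink-connected GPUs
# with PyTorch + Transformers. Assumes both GPUs are visible to CUDA and
# that accelerate is installed; the model name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

assert torch.cuda.device_count() >= 2, "this sketch expects two GPUs"
# Peer-to-peer access (over NVLink where available) lets the two GPUs
# exchange activations directly instead of bouncing through host memory.
print("peer access 0<->1:", torch.cuda.can_device_access_peer(0, 1))

model_name = "meta-llama/Llama-2-13b-hf"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" shards the weights across both GPUs when the model
# doesn't fit on a single card.
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype=torch.float16
)

inputs = tokenizer("Explain NVLink in one sentence.", return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

In a setup like this, the framework handles moving intermediate tensors between the two GPUs; the faster the interconnect between them, the less that transfer eats into inference throughput.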


This is a product built for businesses more than anything, so don’t expect to see the H100 NVL pop up on the shelf at your local Micro Center. However, Nvidia says enterprise customers can expect it to arrive around the second half of the year.

In addition to the H100 NVL, Nvidia also announced the L4 GPU, which is specifically built to power AI-generated video. Nvidia says it’s 120 times more performant for AI video than a CPU while offering 99% better energy efficiency. Beyond generative AI video, Nvidia says the GPU handles video decoding and transcoding and can be leveraged for augmented reality.
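As a rough illustration of the decode/transcode side, GPU video engines like these are typically driven through ffmpeg’s CUDA-accelerated paths; the sketch below, with placeholder file names, assumes an ffmpeg build compiled with Nvidia hardware support and is not a documented L4 workflow.

# Illustrative sketch: GPU-accelerated transcoding via ffmpeg's NVDEC/NVENC
# support, driven from Python. Assumes an ffmpeg build with CUDA enabled;
# input.mp4 and output.mp4 are placeholders.
import subprocess

cmd = [
    "ffmpeg",
    "-hwaccel", "cuda",      # decode on the GPU's video engine (NVDEC)
    "-i", "input.mp4",       # placeholder input file
    "-c:v", "h264_nvenc",    # encode on the GPU (NVENC)
    "-b:v", "5M",            # target bitrate
    "output.mp4",            # placeholder output file
]
subprocess.run(cmd, check=True)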

Nvidia says Google Cloud is among the first to integrate the L4, with Google planning to offer L4 instances to customers through its Vertex AI platform later today. Nvidia said the GPU will be available from partners later, including Lenovo, Dell, Asus, HP, and Gigabyte, among others.
