Nvidia’s new voice A.I. sounds just like a real person

The “uncanny valley” is often used to describe artificial intelligence (A.I.) mimicking human behavior. But Nvidia’s new voice A.I. is much more realistic than anything we’ve ever heard before. Using a combination of A.I. and a human reference recording, the fake voice sounds almost identical to a real one.

All the Feels: NVIDIA Shares Expressive Speech Synthesis Research at Interspeech

In a video (above), Nvidia’s in-house creative team describes the process of achieving accurate voice synthesis. The team equates speech to music, featuring complex and nuanced rhythms, pitches, and timbres that aren’t easy to replicate. Nvidia is creating tools to reproduce these intricacies with A.I.

The company unveiled its latest advancements at Interspeech, which is a technical conference dedicated to research into speech processing technologies. Nvidia’s voice tools are available through the open-source NeMo toolkit, and they’re optimized to run on Nvidia GPUs (according to Nvidia, of course).

The A.I. voice isn’t just a demo, either. Nvidia has transitioned to an A.I. narrator for its I Am A.I. video series, which shows the impacts of machine learning across various industries. Now, Nvidia is able to use an artificial voice as a narrator, free of the usual audio artifacts that come along with synthesized voices.

Nvidia tackles A.I. voices in one of two ways. The first is to train a text-to-speech model on recordings of a human speaking. After enough training, the model can take any text input and convert it into speech. The other method is voice conversion. In this case, the program takes an audio file of a human speaking and converts the voice to an A.I. one, matching the speaker’s pattern and intonation.

For practical applications, Nvidia points to the countless virtual assistants helming customer service lines, as well as the ones present in smart devices like Alexa and Google Assistant. Nvidia says this technology reaches much further, however. “Text-to-speech can be used in gaming, to aid individuals with vocal disabilities or to help users translate between languages in their own voice,” Nvidia’s blog post reads.

Nvidia is developing a knack for tricking people using A.I. The company recently went into detail about how it created a virtual CEO for its GPU Technology Conference, aided in part by its own Omniverse software.

Jacob Roach
Lead Reporter, PC Hardware
Jacob Roach is the lead reporter for PC hardware at Digital Trends. In addition to covering the latest PC components, from…