Baidu’s Deep Voice 2 text-to-speech engine can imitate hundreds of human accents

Baidu, the Beijing-based juggernaut that commands 80 percent of the Chinese internet search market, is investing heavily in artificial intelligence. In 2013, it opened the Institute of Deep Learning, an R&D center focused on machine learning. And in May, it took the wraps off the newest version of Deep Voice, its AI-powered text-to-speech engine.

Deep Voice 2, which follows on the heels of Deep Voice’s public debut earlier this year, can produce real-time speech that’s nearly indistinguishable from a human voice. All the more impressive, it needs just thirty minutes of audio to build a working model, and can imitate the regional accents of hundreds of different speakers.

That’s leaps and bounds better than early versions of Deep Voice, which took multiple hours to learn one voice.

The key is Deep Voice 2’s ability to identify similarities between hundreds of different speakers to build a working model of a human voice. Then, it autonomously derives unique voices from that model. Voice assistants like Apple’s Siri require that a human record thousands of hours of speech, which engineers then tune by hand; Deep Voice 2 doesn’t require guidance or manual intervention.

“Give it the right data, and it can learn on [its] own what sort of features are important,” Andrew Gibiansky, a research scientist at Baidu’s Silicon Valley AI Lab, told The Verge.

Baidu isn’t the only company investing in high-quality text-to-speech tech. Google’s WaveNet, a product of the company’s DeepMind division, generates voices by sampling real human speech and independently creating its own sounds in a variety of voices. Adobe’s Project VoCo transcribes human speech to editable text in real time. And Lyrebird, a Canadian AI startup, licenses algorithms that can imitate any voice with just a single minute of sample audio, create one thousand sentences in less than half a second, and infuse the speech they create with emotions like anger, sympathy, and stress.

But don’t expect Deep Voice 2 or WaveNet to replace Siri, the Google Assistant, or Amazon’s Alexa anytime soon: AI-powered speech synthesis requires more resources than today’s phones can reasonably supply. Still, Baidu sees potential in applications like text-to-speech apps and voice-based assistants. “The ability to quickly synthesize multiple human voices will have a huge effect on products such as personal assistants and eBook readers in the future. For example, each character of your eBook could have a unique voice when you listen to the eBook.”

Kyle Wiggers
Former Digital Trends Contributor