Skip to main content

OpenAI needs just 15 seconds of audio for its AI to clone a voice

In recent years, the listening time required by a piece of AI to clone someone’s voice has been getting shorter and shorter.

It used to be minutes, now it’s just seconds.

Recommended Videos

OpenAI, the Microsoft-backed company behind the viral generative AI chatbot ChatGPT, recently revealed that its own voice-cloning technology requires just 15 seconds of audio material to reproduce someone’s voice.

Please enable Javascript to view this content

In a post on its website, OpenAI shared a small-scale preview of a model called Voice Engine, which it’s been developing since late 2022.

Voice Engine works by feeding it a minimum of 15 seconds of spoken material. The user can then input text to create what OpenAI describes as “emotive and realistic” speech that “closely resembles the original speaker.”

OpenAI insists it is taking a “cautious and informed approach to a broader release due to the potential for synthetic voice misuse,” adding that it wants to “start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities.”

It added: “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

One of the misuses that OpenAI refers to is a scam that some criminals are already carrying out using similar technology that’s been publicly available for some time. It involves cloning a voice and then calling a friend or relative of that person to trick them into handing over cash via a bank transfer. There are also fears about how such technology might be used in the upcoming presidential election, an issue highlighted by a recent high-profile incident in which a robocall using a clone of President Joe Biden’s voice told people not to vote in January’s New Hampshire primary.

Another concern is how the rapidly improving technology will impact the livelihoods of voice actors who fear that they’ll be increasingly asked to sign over the rights to their voice so that AI can be used to create a synthetic version, with compensation for such a contract likely to be much lower than if the actor was asked to perform the job in person.

Looking at more positive deployments of the technology, OpenAI suggests that it could be used to provide reading assistance to non-readers and children using natural-sounding, emotive voices “representing a wider range of speakers than what’s possible with preset voices,” as well as instant translation of videos and podcasts, something that Spotify is already trialing.

It could also be used to help patients who are gradually losing their voice through illness to continue communicating using what sounds like their own voice.

OpenAI has some examples of the AI-generated audio and the reference audio on its website, and we’re sure you’ll agree that they’re pretty extraordinary.

Trevor Mogg
Contributing Editor
Not so many moons ago, Trevor moved from one tea-loving island nation that drives on the left (Britain) to another (Japan)…
ChatGPT vs. Perplexity: battle of the AI search engines
Perplexity on Nothing Phone 2a.

The days of Google's undisputed internet search dominance may be coming to an end. The rise of generative AI has ushered in a new means of finding information on the web, with ChatGPT and Perplexity AI leading the way.

Unlike traditional Google searches, these platforms scour the internet for information regarding your query, then synthesize an answer using a conversational tone rather than returning a list of websites where the information can be found. This approach has proven popular with users, even though it's raised some serious concerns with the content creators that these platforms scrape for their data. But which is best for you to actually use? Let's dig into how these two AI tools differ, and which will be the most helpful for your prompts.
Pricing and tiers
Perplexity is available at two price points: free and Pro. The free tier is available to everybody and offers unlimited "Quick" searches, 3 "Pro" searches per day, and access to the standard Perplexity AI model. The Pro plan, which costs $20/month, grants you unlimited Quick searches, 300 Pro searches per day, your choice of AI model (GPT-4o, Claude-3, or LLama 3.1), the ability to upload and analyze unlimited files as well as visualize answers using Playground AI, DALL-E, and SDXL.

Read more
​​OpenAI spills tea on Musk as Meta seeks block on for-profit dreams
A digital image of Elon Musk in front of a stylized background with the Twitter logo repeating.

OpenAI has been on a “Shipmas” product launch spree, launching its highly-awaited Sora video generator and onboarding millions of Apple ecosystem members with the Siri-ChatGPT integration. The company has also expanded its subscription portfolio as it races toward a for-profit status, which is reportedly a hot topic of debate internally.

Not everyone is happy with the AI behemoth abandoning its nonprofit roots, including one of its founding fathers and now rival, Elon Musk. The xAI chief filed a lawsuit against OpenAI earlier this year and has also been consistently taking potshots at the company.

Read more
ChatGPT has folders now
ChatGPT Projects

OpenAI is once again re-creating a Claude feature in ChatGPT. The company announced during Friday's "12 Days of OpenAI" event that its chatbot will now offer a folder system called "Projects" to help users organize their chats and data.

“This is really just another organizational tool. I think of these as smart folders,” Thomas Dimson, an OpenAI staff member, said during the live stream.

Read more