Everyone reading this will likely be familiar with deepfakes, the sometimes humorous and oftentimes scary technology that makes it possible to alter an existing image or video by digitally replacing a person’s likeness. But while deepfakes are well known, the audio equivalent — capable of simulating the voice of a real person — hasn’t received quite the same level of coverage.
Nonetheless, the technology to do this is out there, and it is getting better all the time. The latest demonstration comes from the startup LOVO. The company has developed a new tool it claims can re-create accurate (and recognizable) human voices, complete with gradations of emotion and tone that add to the realism. While the results aren’t perfect across the board, they can be eerily accurate at times.
“We take a recording data of your voice, run it through our proprietary machine learning models that learn your voice’s tone, pitch, speed, accents, habits, and other bits that truly make your voice unique, and create a clone of your voice which actually can figure out how you would speak or pronounce certain words even if they weren’t included in the original data that was fed in,” Tom Lee, co-founder of LOVO, told Digital Trends.
In a demonstration of its tech, seen above, LOVO re-creates the voices of famous public figures such as the late South African president Nelson Mandela. While this could potentially be used for damaging purposes (imagine combining these tools with deepfake videos to fabricate politicians saying things they never actually said), LOVO has its eye on useful applications.
“Imagine a radio ad on Spotify that calls out each user by their name,” Lee said. “[That would be] one million variations of the same ad, created with a few clicks, no extra recording necessary. Imagine teachers and corporate lecturers cloning their voices and turning their courses to audio files without having to record each new session. Imagine preserving the voice of your loved one and making your smart speaker talk to you in that voice. The possibilities are simply endless.”
The company’s LOVO Studio platform is aimed at any application that requires synthetic voices. That could include marketing videos, e-learning materials, audiobooks, games, smart speakers, and more. It offers a number of synthesized voices for creating voiceovers, with control over tone, pronunciation, speed, and other elements.