Sir David Attenborough, wildlife documentary broadcaster and natural historian, is an international treasure who must be protected at all costs. Now 94, Attenborough is still finding new dark recesses to explore on Planet Earth — including the r/Relationships and r/AskReddit boards on Reddit. Well, kind of.
In a series of videos posted to YouTube this week and shared by Motherboard, Attenborough’s sonorous voice is used to questionable effect by being assigned to an A.I. that reads out Reddit threads. The result is Reddit with a whole lot more gravitas than you’d normally expect from it, all thanks to a little deep learning magic.
The audio deepfake video was created by software developer Garett MacGowan, who explains his process in a special “making of” video. MacGowan used Google’s text-to-speech software, but managed to give it a suitably human-sounding tone by employing a software-generated voice model trained on Attenborough’s real speech. The voice model was not made by MacGowan himself, but was instead compiled by fellow YouTuber YouMeBangBang.
The results don’t sound wholly convincing (although I’m not sure how convincing David Attenborough reading Reddit threads would ever sound). Attenborough mispronounces some words, and there’s not as much drama to his reading as you would expect from a story about Redditor relationship drama. Nonetheless, it’s another compelling piece of evidence showing how good audio deepfakes are getting.
This isn’t totally new territory. Earlier this year, we wrote about a complaint by Jay-Z’s record label over an audio deepfake of the famous rapper that popped up online. There’s no shortage of other famous person audio deepfakes, either. By far the most impressive, however, is a Massachusetts Institute of Technology-created deepfake, combining both video and audio, that shows President Richard Nixon reading out an alternate address written in the event that the 1969 Apollo moon landing went horribly wrong.
These technologies are not only advancing all the time but, as the numerous YouTube videos in the genre show, are now accessible to anyone who wants to make an audio deepfake. Fortunately, most of the use cases so far have been attempts at humor, rather than anything more malicious. Not that something couldn’t change in the future.