“Gucci Mane crazy, I might pull up on a zebra/ Land on top a eagle, smoke a joint of reefa.”
That’s a Gucci Mane lyric from the 2010 track “It’s Gucci Time,” off the album The Appeal: Georgia’s Most Wanted.
“It is a truth universally acknowledged/ that a single man in possession of a good fortune, must be in want of a wife.” That’s now a Gucci bar, too, albeit one originally written by Jane Austen in her 1813 novel of manners, Pride and Prejudice. Gucci imbues it with a level of trap rap swagger that doesn’t quite come across in other readings of the classic English text. (By comparison, the top Audible entry for the same novel is read by the decidedly non-trap-rap superstar Rosamund Pike.)
Gucci, as it turns out, has been busy, busier even than he was during the 2010-2015 period when he was issuing mixtapes at a dizzying rate of roughly one per month. Today, the 41-year-old rapper debuted recordings of himself reading an assortment of classic works under the somewhat brilliant title “Project Gucciberg.” The titles include Alice’s Adventures in Wonderland, Little Women, A Modest Proposal, Dracula, and The Importance of Being Earnest.
Only he didn’t. Well, not exactly.
It’s more deepfake audio wizardry, this time courtesy of the folks at New York-based digital arts collective MSCHF. Their last project attached a paintball gun to one of Boston Dynamics’ Spot robots and let users control it remotely over the internet. Now the team has lent its button-pushing, tech-savvy brand of prankster irreverence to a project in which the rapper born Radric Delantic Davis is, himself, remote-controlled (at least, his words are) to narrate a slew of vintage texts.
Evil geniuses
MSCHF’s Daniel Greenberg told Digital Trends: “Gucci Mane is one of the most impactful musicians in the history of rap. Project Gutenberg is one of the last bastions of public domain texts on the internet. By combining the two, using the power of A.I. technology, we have created the most impactful rapper-read public domain audiobooks in the history of the internet.”
To create their (totally unauthorized) literature-loving A.I. rapper, the team crafted a training dataset of around six hours of Gucci’s speech, pulled from interviews, podcasts, and whatever other publicly accessible recordings they could scavenge from YouTube. This source material was then edited, trimmed into 10-second segments, EQ’d, transcribed, and labeled.
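For a sense of scale, here is a minimal Python sketch of the segmenting step. It is not MSCHF's code: the file name, the `Segment` record, and the `split_recording` helper are all hypothetical, and the six hours are treated as a single continuous recording purely for the sake of the arithmetic.

```python
from dataclasses import dataclass

SEGMENT_SECONDS = 10.0  # clip length reported in the article


@dataclass
class Segment:
    source: str           # hypothetical file name of the source recording
    start: float          # offset into the recording, in seconds
    end: float            # end offset, in seconds
    transcript: str = ""  # filled in later, during transcription and labeling


def split_recording(source: str, duration: float,
                    length: float = SEGMENT_SECONDS) -> list[Segment]:
    """Cut one recording into consecutive clips of at most `length` seconds."""
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + length, duration)  # the last clip may be shorter
        segments.append(Segment(source, start, end))
        start = end
    return segments


# Assumption: all six hours live in one file, which the real dataset did not.
clips = split_recording("gucci_interviews.wav", duration=6 * 60 * 60)
print(len(clips))  # 2160
```

In other words, six hours of speech yields only a couple of thousand labeled clips, which helps explain why each one was cleaned up by hand.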
“Additionally, our team built out a Gucci pronunciation key/dictionary to better capture the idiosyncrasies of Gucci Mane’s particular argot,” Greenberg said. He added, “Seriously, this thing is the equivalent of a linguistics thesis.”
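MSCHF did not publish its pronunciation key, so the following is only a guess at the general shape of such a thing: a lookup table that rewrites words into the spelling the voice model should pronounce. The single entry is borrowed from the lyric quoted at the top of this article, and the `apply_key` helper is hypothetical.

```python
import re

# Hypothetical key: the real MSCHF dictionary is not public.
PRONUNCIATION_KEY = {
    "reefer": "reefa",
}


def apply_key(text: str, key: dict[str, str] = PRONUNCIATION_KEY) -> str:
    """Swap whole words found in the key; leave every other word untouched."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        # Lookup is case-insensitive; replacements use the key's own spelling.
        return key.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, text)


print(apply_key("smoke a joint of reefer"))  # smoke a joint of reefa
```

A production version would more likely map words to phonemes than to respellings, but the principle of normalizing the text before it reaches the model is the same.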
The dataset was then used to train an A.I. model, which was tuned repeatedly to improve its output. Finally, the team added human touches to the text, such as pregnant pauses inserted where required for flair.
“It may sound like Gucci is speaking into a broken microphone at times, or on a bad audio stream — because he was in a lot of our source material,” Greenberg admitted. “However, barring these environmental factors, we feel the actual voice emulation is extremely successful. It is both amazing and scary how good this technology is to make anyone say whatever you want.”
The real Gucci Mane did not respond to a request for comment. However, this is, as Greenberg acknowledged, something of a “gray area” when it comes to copyright. “The copyright implications of deepfakes have not yet been legislated,” he said. “All of the audio samples we trained our model on were publicly available through interviews. At the end of the day, we have a voice that is not ours, reading public domain text that we didn’t write, but we are creating our ‘own’ audiobooks.”
Deepfake-A-Thon
Last year, Jay-Z’s entertainment agency, Roc Nation LLC, took issue with an audio deepfaker who used the rapper’s voice to recite material like the Navy SEAL copypasta on YouTube. It was, as I noted at the time, a brain-teasing conundrum for a rapper who once rapped the line “I sampled your voice, you was usin’ it wrong” during his early 2000s beef with Nas. But Roc Nation wasn’t getting into the ironic complexity of the case. They were just annoyed about someone “unlawfully [using] an A.I. to impersonate our client’s voice.”
It’s not difficult to see why an artist might be perturbed by such a thing. Like the visual deepfakes that place actors in movies in which they never appeared (or, in clips doing the rounds recently, Tom Cruise in a series of hyperactive TikTok videos), an audio deepfake takes an artist’s most valuable asset, in this case their voice, and uses it to create something they never consented to perform. There are both ethical and financial issues at stake.
“The history of rap is the history of self-reference,” Greenberg maintained. “Throughout the entire canon of the tradition, throughout the body of a given performer’s work. When you peek under the hood of an A.I. learning model, there’s an uncannily similar process occurring — a kind of hyper-self-reference. Oblique as it may seem, this all dovetails quite nicely.”
Should we be worried about the risk of audio deepfakes in a world where real and fake can be blurred to a startling degree?
“Absolutely, but alarm won’t stop deepfakes from becoming more and more mainstream,” he said. “This technology is here to stay — we should be so lucky if it’s only ever used for fun. Maybe doing fun things with it will help keep us in that realm. We have reached an inflection point where truth and fiction are becoming impossible to discern on the internet. Thus, we realized it was crucial that we soothe our ears with Gucci Mane’s gentle A.I.-generated reading voice.”
As siren songs to usher us onto the rocks of Skynet go, maybe Gucci isn’t so bad, as it happens. Especially if it could be 2009-era Gucci, circa The State vs. Radric Davis.