Companies clamor to use Remie Michelle Clarke’s voice. An award-winning vocal artist, she lends her smooth Irish accent to ads for Mazda and Mastercard and is the sound of Microsoft’s search engine, Bing, in Ireland.
But in January, her sound engineer told Michelle Clarke he’d found a voice that sounded uncannily like hers someplace unexpected: on Revoicer.com, credited to a woman named “Olivia.” For a modest monthly fee, Revoicer customers can access hundreds of different voices and, through an artificial intelligence-backed tool, morph them to say anything — to voice commercials, recite corporate trainings or narrate books.
Revoicer advertised “Olivia” with a photo of a gray-haired woman, who appeared to be of Asian descent, and a blurb: “A deep, calm and kind voice. Excelent [sic] for audio books.”
A 38-year-old brunette, Michelle Clarke looked nothing like “Olivia.” But when she hit play, she was greeted with the jarring sound of what could only be her own voice: “Hello my dear ones, my name is Olivia,” it said. “I have a soft and caring voice.”
“It’s completely bizarre,” Michelle Clarke said in an interview with The Washington Post. “When you see your voice has been shifted and tampered with … there’s something so invasive about it.”
But Michelle Clarke isn’t the only one who has found her voice seized from her control. Advances in generative artificial intelligence, technology that forms text, images or sounds based on the data it is fed, have allowed software to recreate people’s voices with eerie precision. Such software can quickly spot patterns by comparing a small sample to a database of millions of voices, allowing users to brandish simple text-to-speech tools to make a voice say whatever they type.
The technology burst into the public eye this month, when a music producer claimed to have used AI versions of Drake’s and the Weeknd’s voices to build a new track, “Heart on My Sleeve,” which spread rapidly on TikTok. A number of celebrities have experienced these verbal deepfakes, including Emma Watson, whose cloned voice recited passages of Adolf Hitler’s Mein Kampf, and President Biden, who was artificially made to say he preferred low-quality marijuana.
But the technology puts voice actors, the often-nameless professionals who narrate audiobooks, video games and commercials, in a particularly precarious position. While their voices are often known, they rarely command the star power necessary to wield control of their voice. The law offers little refuge, since copyright provisions haven’t grappled with artificial intelligence’s ability to recreate humanlike speech, text and photos. And experts say contracts more frequently contain fine-print provisions allowing a company to use an actor’s voice in endless permutations, even selling it to other parties.
Neal Throdes, a developer at Revoicer.com, said the company used the voice through a licensing agreement with Microsoft, which allows them unrestricted access to Michelle Clarke’s sample. Hours after The Post contacted Revoicer.com, the company pledged to remove the voice from their site. “We have taken responsibility,” Throdes said in an email, adding “Revoicer.com is not responsible for the situation [Michelle Clarke] is in.”
Several voice actors told The Post they may abandon their careers, seeing a cataclysmic future where people can obtain a voice without hiring an individual. Michelle Clarke wonders why a company would pay the $2,000 she can command for a 30-second recording when it can instead pay $27 a month for a realistic clone.
“How many other companies … are using my voice and my work and my livelihood without ever factoring me in?” Michelle Clarke asked.
Voice-generating software is benefiting from a boom in generative AI, which backs chatbots like ChatGPT and text-to-image makers like DALL-E and has rapidly increased in sophistication in the last year.
While AI has long helped companies successfully mimic speech, it churned out robotic, unrealistic voices, said Zohaib Ahmed, chief executive of Resemble.AI, a company that uses artificial intelligence to generate voices.
But improvements in the underlying architecture and computing power of this software upgraded its abilities. Now it can analyze millions of voices quickly to spot patterns between the elemental units of speech, called phonemes. This software compares an original voice sample to troves of similar ones in its library, finding unique characteristics to produce a realistic-sounding clone.
Before this advanced pattern recognition was possible, voice-generating software needed thousands of sentences to duplicate a voice, Ahmed said. Now, these tools work with just a few minutes of recorded speech.
“You don’t need an hour … or 20 hours anymore,” Ahmed said. “You just need like a few minutes, a few seconds … to basically get something that sounds … 90 percent [accurate].”
This advancement has been a boon to some: People with degenerative illnesses, like ALS, can bank their voices using artificial intelligence. Voice cloning software allowed Val Kilmer, who lost his voice after surgery for throat cancer, to speak for his role in “Top Gun: Maverick.”
But it’s also given rise to predatory industries. People have reported the voices of their loved ones being recreated to perpetrate scams. Start-ups have emerged that scrape the internet for high-quality speech samples, bundle hundreds of voices into libraries and sell them to companies for their commercials, in-house trainings, video game demos and audiobooks, charging less than $150 per month.
Tim Friedlander, the president of the National Voice Actors Association, an advocacy organization, said these “middlemen” start-ups provide companies a lucrative proposition: lifelike voices that can say what’s needed without having to deal with the higher costs associated with human professionals.
Friedlander added that generative AI’s impact on his industry has only started, and that it is likely to disrupt the industry greatly. “It’s scary,” he said. “Voice actors, unknowingly, have been training their replacements.”
‘That’s my voice’
Bev Standing was at home one afternoon when her children sent a flurry of texts asking the same thing: Mom, are you the voice of TikTok?
Standing was confused. The Canadian voice actor had done work for many clients, but TikTok hadn’t hired her to narrate anything, she said, and she certainly wasn’t getting paid by its parent company, ByteDance.
But on the app she found herself everywhere. As the voice behind TikTok’s iconic text-to-speech feature, she was narrating cat videos, critiquing shoddy boyfriends, touting McDonald’s hamburgers and pitching investment tools she’d never heard of.
She wasn’t immediately angry. “For about three days, it was fun,” Standing said. “But as soon as my business brain kicked in, it wasn’t.”
Standing took a job in 2018 for a client on behalf of the Chinese Institute of Acoustics and recorded her voice for a translation app. She read in the monotone style emblematic of TikTok’s narration feature, but she said there weren’t any provisions in the contract allowing them to sell her voice to other companies.
She sued ByteDance in 2021 and settled out of court for an undisclosed sum. Shortly after, TikTok removed her voice from the app. Kat Callaghan, a Canadian disc jockey, is now the voice.
While the software that cloned Standing’s voice is likely less sophisticated than current technology, Standing says she does not appreciate having her voice copied without her permission.
“That’s my voice,” she said. “You can’t just take it without paying me.”
Despite Revoicer.com pledging to take down “Olivia’s” voice, Michelle Clarke says her livelihood is still at risk. Other third-party sites could be reselling her voice. Her friends have passed along Instagram ads that she appears to be narrating, even if she hasn’t heard of the company. “The problem is not solved for me,” she said.
But as a mother of a 1-year-old boy, she thinks she may quit doing voice-over work. “There’s no right time to feel like your future is at stake,” she said. “But it’s absolutely the worst time for me now.”
Little recourse is available to voice actors. Until recently, artificial intelligence didn’t pose much of a threat to their professions, and many said they didn’t parse through contracts in detail, searching for provisions allowing a company to use their audio beyond an individual job.
Copyright law has also not matured enough to decide what happens when a person’s voice is mimicked for profit, leading to patchwork enforcement where celebrities can access more protections than lesser-known professionals. (For example, Drake’s AI-generated song was quickly taken off YouTube and Spotify last week after Universal Music Group raised concerns.)
Daniel J. Gervais, an intellectual property expert and professor at Vanderbilt University Law School, said U.S. law doesn’t offer much refuge for people who’ve had their voices taken.
Federal copyright law does not protect a person’s voice, and local laws vary by state, he said. Even in California, which because of its prominence in the entertainment industry has some of the stronger voice protections, it’s difficult to assert who’s covered. The state’s law says a voice must be considered distinct, meaning identifiable, and must come from a well-known person, making it hard for the average voice actor to be protected, Gervais said.
Friedlander said his colleagues must be vigilant in how their voices are being used on the internet and pay close attention to the details of their contracts.
Many voice actors are not unionized, and Friedlander’s advocacy organization is urging actors to scan for provisions that ask for the rights to their voice in perpetuity. The organization has crafted template contracts for actors to use that give them control over how their voice is used.
In Europe, it’s easier to get a sound recording copyrighted, and commercial scraping of such content requires permission from the recording’s owner, Gervais said. The European Union has also charted a stronger stance against artificial intelligence by proposing laws that would classify the risks of an AI system.
“There’s a huge fork in the road between Europe and the United States,” he said. “It is much more aggressive.”
‘It’s nothing like as good as me’
In late January, Mike Cooper received an email from a company advertising a library of voice-overs for sale. Intrigued, he scrolled through the page and quickly found his voice in the library as a sample.
“It was a very surreal moment when I clicked ‘play’ on that, and heard my own voice coming back to me,” he said.
Cooper, who lives in Asheville, N.C., said he was angry at first. But then he remembered why this happened. The company now selling his voice had likely gotten it after acquiring a firm Cooper did a few minutes of voice-over work for in 2016.
Cooper remembers a provision in his contract saying his voice could be used elsewhere. But he recalls thinking it was harmless. He was only giving the company a few minutes of his voice, he said.
“I viewed the risk as extremely small,” he said. “I was absolutely wrong.”
But Cooper said synthetically generated voices made without his input can’t offer what he can — a deep understanding of what a project needs, and a performance with emotion and intention.
“It’s nothing like as good as me,” he said.