Audiobook Narrators Fear Apple Used Their Voices to Train AI

After a backlash, Spotify paused an arrangement that allowed Apple to train machine learning models on some audiobook files.

Gary Furlong, a Texas-based audiobook narrator, had worried for a while that synthetic voices created by algorithms could steal work from artists like himself. Early this month, he felt his worst fears had been realized.

Furlong was among the narrators and authors who became outraged after learning of a clause in contracts between authors and leading audiobook distributor Findaway Voices, which gave Apple the right to “use audiobooks files for machine learning training and models.” Findaway was acquired by Spotify last June.

Some authors and narrators say they were not clearly informed about the clause and feared it may have allowed their work or voices to contribute to Apple’s development of synthetic voices for audiobooks. Apple launched its first books narrated by algorithms last month. “It was very disheartening,” says Furlong, who has narrated over 300 audiobooks and is one of more than a dozen narrators and authors who told WIRED of their concerns with Findaway’s agreement. “It feels like a violation to have our voices being used to train something for which the purpose is to take our place,” says Andy Garcia-Ruse, a narrator from Kansas City.

The dispute led to a reversal this week from Apple and Findaway, according to labor union SAG-AFTRA, which represents recording artists as well as actors and other creatives. An email to members seen by WIRED said that the two companies had agreed to immediately stop all “use of files for machine learning purposes” for union members affected and that the halt covers “all files dating back to the beginning of this practice.”

Jane Love, SAG-AFTRA’s national director for audiobooks, confirmed that Apple’s access to files from Findaway had been halted. She says the union is still “working with Findaway toward a solution that recognizes the union’s concerns” such as, “safe storage of the recordings and data, usage limitations, and appropriate compensation.” 

Spotify declined to comment on changes made by Findaway or whether making SAG-AFTRA members’ content off-limits to Apple could be unfair to authors and narrators who are not part of the union. Apple did not respond to requests for comment.

After Furlong first learned early this month that Findaway’s agreements gave Apple machine learning rights, he contacted Isobel Starling, an author he’d worked with who distributed titles with the company. She was shocked to find a clause titled “Machine Learning” near the bottom of her lengthy agreement with Findaway.

Starling says the company had not specifically informed her about that part of the agreement, nor compensated her for it. She believes she missed it because it was buried beneath more conventional sections prohibiting hate speech and sexually explicit material. Although Furlong narrated the audiobook and his voice would potentially be ingested by Apple’s machine learning algorithms, he was not party to the agreement that was signed by Starling as the book’s rights holder. 

Findaway’s machine learning clause says rights holders can revoke that part of the agreement. Starling raced to email the platform to say she’d like to exercise that right, and soon received a response saying that the company had submitted her opt-out request to Apple. Furlong says Findaway has not responded to an emailed request to withdraw all copies of his voice from Apple’s servers.

Starling believes Findaway has misused the material that authors and narrators entrusted it with. “This is immoral and illegal,” Starling told WIRED. “Rights holders have the copyrights for the audiobook production only, but no claim on the narrator’s voice.” She’s pausing the release of three upcoming titles she planned to distribute via Findaway.

Interest in automating the art of book narration has grown in recent years for business and technology reasons. Audiobook revenue has continued to grow even as book and ebook revenue has dipped, and synthetic voice technology has improved dramatically. A range of tools has cropped up that let anyone clone a voice for synthetic narration with a click, but to advance the technology, companies still need hoards of data.

Across industries like entertainment and gaming, contracts requiring voice actors to let tech companies train AI models for digital narration on their recordings have become increasingly common, says Tim Friedlander, president of the US-based National Association of Voice Actors. Adobe, maker of Photoshop and other image software, recently began training its own AI algorithms on visual creatives’ work unless they opted out.

“The voice is how voice actors make a living,” Friedlander added, “and this is literally taking the words out of our mouths without our consent.”

Google began offering free synthetic narration for books in 2020. When Apple announced its own set of digital audiobook narrators in January, the company said it hoped to eliminate the “cost and complexity” that producing a human-narrated audiobook can represent for small publishers and independent authors. The company’s Books app lists titles with AI narration as “narrated by digital voice based on a human narrator.”

Apple has used synthetic voice technology for years, including for the Siri virtual assistant, driving directions, and accessibility features. But some authors and narrators suspect that audio from their audiobooks helped the company adapt its technology to the complex task of narrating books. The length of audiobooks, the complexity of the material, and the impressive skills of talented narrators make voicing books arguably the toughest challenge for synthetic voice technology.

Applying synthetic voices to books also brings new business and cultural challenges. “Most of the companies developing these AI technologies come from the technology sector, rather than the entertainment sector,” says SAG-AFTRA’s Love. “They lack the relationships, history of protections, and reliance on approval rights voice actors have come to expect.” 

Several authors told WIRED that Findaway has emerged as a reliable distributor, offering lucrative deals to list audiobooks across several platforms. But they also say that Findaway frequently prompts people to agree to updated agreements, usually with minor changes, when they log in to their accounts. The company added the machine learning clause to its distribution agreements in 2019.

Many suspect they signed off on the machine learning clause without realizing it. “It’s on me for not initially noticing the addition and what it fully meant,” says Laura VanArendonk Baugh, an author based in Indianapolis, Indiana. “But the placement was kinda sneaky, too.”

Matthew Sag, a law and AI professor at Emory University Law School in Atlanta, says Spotify and Apple are probably legally in the clear unless a narrator explicitly prohibited such use of their audio in their contract with an author, or Apple produced a like-for-like AI clone of their voice. “In terms of copyright law,” he says, “the voice actors have almost inevitably assigned all their copyright to the studio or publisher that made the initial recording.”

Morally, authors and narrators feel it’s a different story. Jon Stine, executive director of the Open Voice Network (OVON), a Linux Foundation nonprofit developing ethical guidelines for conversational AI, says Findaway has breached several ethical principles by not seeking narrators’ consent or ensuring proper compensation for the owners of the voices.

Some use of synthetic voices is inevitable, he says, and the only way narrators can protect their art is with contracts that clearly spell out “usage rights and compensation.” To help actors navigate relationships with synthetic voice firms, OVON has developed a standard contract that does exactly that, Stine added.

NAVA’s Friedlander agrees that, in general, the use of synthetic voices is “not an inherently bad thing” and can help lesser-known authors get their work narrated, but he warns it could damage the livelihoods of working voice actors. For this technology to advance ethically, Friedlander says, legislation is needed to prevent “unauthorized sharing and synthesization of voices.”

Until that happens, Lillian Rachel, a voice actor who deleted her Findaway Voices profile after learning about the machine learning clause, is pinning her hopes on listeners’ faith in the human-to-human connection.

“A good actor does more than just read the story. They imbue it with core emotions and bring out the subtext, elevating the written words with empathy and nuance,” Rachel says. “We bring the human lived experience to each story in a way that cannot be replicated.” 

Updated 02/14/2023, 1:10 pm EST: The headline and subheadline have been updated to better reflect the nature of the dispute.