Future Tense

Could the Internet Archive Go Out Like Napster?

A gavel rests in front of the Internet Archive logo.
Photo illustration by Slate. Photos by artisteer/iStock/Getty Images Plus and Internet Archive.

Two and a half years ago, the Internet Archive made a decision that pissed off a lot of writers—and embroiled it in a lawsuit that many netizens fear could weaken the archive, its finances, and its services long into the future.

In March 2020, as bookstores and libraries joined other businesses in closing their doors, the Internet Archive tried a virtual solution. It had long offered an Open Library, which contains a massive number of scanned books that users can check out online one at a time. In response to the pandemic, it lifted limits on the number of scanned copies available for checkout as well as the length of time a given book could be checked out, temporarily becoming a “National Emergency Library.” The plan was to conclude the project by June 30, 2020.

As Slate reported at the time, prominent writers including Chuck Wendig, N.K. Jemisin, and Colson Whitehead spoke out against the National Emergency Library; many called it “piracy” and condemned the archive for allegedly stealing from creators. More than two months after the National Emergency Library kicked off, Hachette Book Group, HarperCollins, Penguin Random House, and John Wiley & Sons (all members of the Association of American Publishers) sued the Internet Archive, alleging “willful mass copyright infringement.” The publishers alleged that the archive had made 127 of their books available to the public without permission, thus infringing upon publishers’ intellectual property rights and eating into their profits during a moment of economic turbulence. In response, the archive ended the National Emergency Library a little earlier than planned, on June 11, 2020.

But that lawsuit is ongoing, and it has recently escalated. In July, both sides of Hachette v. Internet Archive asked the district court overseeing the case to speed it up and issue a ruling without a trial, a step known as summary judgment. More recently, both parties also filed opposition briefs rebutting each other’s arguments. We may find out sooner rather than later whether the beloved digital nonprofit will prevail in its fight against some of the world’s biggest publishers.

Since the suit was filed, many of the authors who’d protested the archive have deleted their tweets or released statements explaining they’ve changed their minds. Wendig, who initially appeared to be leading the charge, has since stated several times that he is not involved with the case. And on July 14, the Authors Alliance, an organization that helps authors to reach more readers, filed an amicus brief in the lawsuit on behalf of the Internet Archive.

One thing hasn’t changed: fears that the vagaries of this case could cripple the archive and, subsequently, the myriad services it offers the 1.5 million people who visit it every day. In addition to lending books digitally, the Internet Archive hosts the Wayback Machine, a tool that has chronicled internet history since 1996; the concern is that if legal costs drain the archive of its funds, all of its services could be affected. Users of the site and digital archivists have compared the potential loss of the archive’s services to the burning of the Library of Alexandria. Yet book companies also view the stakes here as existential for their business model; the International Publishers Association stated that this case is of “global significance” to its members.

If even authors themselves appear to be backing away from the battle, why are publishers continuing the suit—and what could it really mean for the internet and its most comprehensive archive?

Initially, publishing companies were suing the IA just over the National Emergency Library. But as Peter Suber, a former professor of philosophy and the current director of the Harvard Open Access Project, explained, “If [the lawsuit] were strictly about the National Emergency Library, it would be moot. … It would be dismissed.” After all, it closed in June 2020. The fact that the case has continued suggests that “the publishers want to block controlled digital lending,” he said.

In controlled digital lending, a library scans each individual page of a physical book that it already owns, uploads a digital copy, and generally allows one patron to check it out for a period of time, according to two members of the Authors Alliance, Rachel Brooke (who drafted the aforementioned amicus brief) and Dave Hansen. When the digital copy is checked out, the physical copy is pulled from the shelves. That way, only one person is reading one copy at a time. This normally applies to the IA’s entire copyrighted collection in the Open Library, which encompasses more than 2 million modern books. (It’s unclear how many people actually use the Open Library itself.) As Suber notes, however, for the National Emergency Library, this practice was dropped, meaning the archive wasn’t practicing typical CDL at the time.
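
To make the own-to-loan idea concrete, here is a minimal Python sketch of the rule described above. It illustrates the principle under stated assumptions and is not the Internet Archive’s actual software: the names Title, LendingLibrary, checkout, give_back, and enforce_ratio are all hypothetical, and a real system would also have to handle loan periods and waitlists.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Title:
    """One book title held by a hypothetical lending library."""
    owned_copies: int  # physical copies the library owns
    borrowers: set[str] = field(default_factory=set)  # patrons with an active digital loan


class LendingLibrary:
    """Toy model of digital lending with an optional own-to-loan cap."""

    def __init__(self, enforce_ratio: bool = True) -> None:
        # enforce_ratio=True models controlled digital lending.
        # enforce_ratio=False models lifting the cap, as the article
        # says happened during the National Emergency Library.
        self.enforce_ratio = enforce_ratio
        self.titles: dict[str, Title] = {}

    def add_title(self, name: str, owned_copies: int) -> None:
        self.titles[name] = Title(owned_copies)

    def checkout(self, name: str, patron: str) -> bool:
        """Return True if the loan is granted, False if it is refused."""
        title = self.titles[name]
        if patron in title.borrowers:
            return False  # this patron already has the book
        if self.enforce_ratio and len(title.borrowers) >= title.owned_copies:
            return False  # every owned copy is already on loan
        title.borrowers.add(patron)
        return True

    def give_back(self, name: str, patron: str) -> None:
        """End a loan, freeing that copy for the next patron."""
        self.titles[name].borrowers.discard(patron)
```

With one owned copy and the cap enforced, a second patron’s checkout is refused until the first returns the book; with enforce_ratio=False, both loans go through at once, which is the difference Suber points to.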

Libraries that loan out e-books tend to use a different method: They license an e-book through an app like Libby, but only for a certain number of borrows. Once that number of borrows is up, the library either lets the contract expire or renews it for more borrows. As in CDL, one person can access one copy of the e-book at a time. Hoopla, a less popular app, works similarly but doesn’t limit how many people can access the same e-book at a time.

From the readers’ perspective, it might not matter whether a library uses e-book loans or CDL. But it matters to those with a financial stake in the process. With e-books, publishers and authors are paid per individual read, but with physical books and CDL, they only see the profits of the initial sale. Many librarians see CDL as a solution to the hefty costs of purchasing e-books for their collections.

The Internet Archive argues that controlled digital lending is OK under fair use doctrine, which allows limited use of copyrighted material without the holder’s permission under certain circumstances. (It also says that the National Emergency Library should qualify as fair use “under the unique circumstances” of spring 2020.) As Brooke notes, so far, no court has ruled that fair use doctrine covers CDL. It’s not clear how many libraries already practice CDL or are waiting to implement it based on the results of this case. At least 45 institutions have signed in support of the official CDL position statement, but usage seems limited to research and university libraries, rather than public ones.

One key benefit of CDL is that it makes books accessible even if they haven’t gone through the e-book process—to both readers and writers. Lots of titles are only available in physical form and in limited numbers. Hansen calls this the “20th century black hole”: Many older books and media have never been digitized because the copyright hasn’t expired, and ownership has become so murky that publishers aren’t even sure if they have the rights to them anymore. CDL could eliminate that problem—and it could keep physical 20th century books from deteriorating, since loaning digitized versions keeps old print copies off the lending shelf. Even before COVID, many writers couldn’t access physical archives, either because of geographic barriers or print disabilities, which make it difficult or impossible to read physical texts.

The Authors Alliance hopes that if those sorts of benefits to authors are made clear, the court will be more inclined to consider CDL a fair use under the first sale doctrine. Suber feels that controlled digital lending’s biggest strength in this case is in its very name: the control. “The purpose of these controls is to lend digital books on roughly the same terms as print books,” he said. “And if publishers can live with the free lending of print books, then they should be able to live with the free lending of digital books.” And perhaps if the Internet Archive hadn’t dropped its own-to-loan ratio control, the publishers could have.

Academic writers, like the ones Authors Alliance represents, are known to “write to be read,” as Brooke described it: They want to influence scholarly discourse and share knowledge with the world. Since they need to research in order to write, it follows that they would support CDL. Fiction and commercial writers rely more heavily on their royalties, which might sway them to agree with the publishers’ argument.

But Tochi Onyebuchi, a former civil rights lawyer and author of several science fiction and fantasy books, says he doesn’t blame the Internet Archive for any lost revenue for authors, even from the National Emergency Library. “I think if there was any significant market impact on bookstores, it was probably COVID that did it, not the Internet Archive,” he said, adding that after the initial pandemonium in early 2020, people began buying books in record numbers. While authors certainly deserve a living wage for their work (in his words, “I like to eat, I like to have a roof over my head!”), Onyebuchi also noted that “the publisher gets paid before the author does, and the publisher gets paid a bigger chunk than the author does.”

Despite these points, other experts Slate spoke with were doubtful that the archive could prevail, emphasizing that the dispute hinges less on philosophical points of view than on the unilateral actions the IA took. “There’s a settled case that the Internet Archive can buy books and scan them,” said Stephen Witt, a tech reporter and author of an acclaimed book on the history of music piracy. “The question is to what extent can they lend them out without the authorization of the publisher.”

Witt also pointed to legal precedent. “This fight was probably lost over 20 years ago,” he said, referring to cases like 2005’s MGM v. Grokster, in which the Supreme Court unanimously held that peer-to-peer file-sharing services like Napster could be sued for copyright infringement, as well as Authors Guild v. Google, which was filed the same year but only decided in 2015. That case kicked off when Google began scanning various publications for its Books feature, which Christian L. Castle, a Texas-based entertainment and tech lawyer who blogs at Music Technology Policy, identifies as the moment “all these disputes started” around the digitization of physical books.

The Authors Guild lost in that particular case, however. Google was making only snippets of each publication available for search purposes, so Books technically qualified as fair use. The IA, on the other hand, offers access to entire volumes, and in the case of the National Emergency Library, it did so with fewer barriers to access. Still, Witt agreed with Onyebuchi, saying, “I don’t think that this really costs the publishers very much money.”

At the end of the day, the money is only part of the legal equation, as UCLA professor John Villasenor told Slate. “The impact on the market is only one of the four fair use factors that courts must consider,” he said. So even if business impact is determined to be negligible in all projections, “the outcome is far from assured.”

Should the courts rule in the publishers’ favor, the best-case scenario for the Internet Archive is that this is just a narrow check on its power, another legal limit to be aware of. A more worrying result could be that other copyright holders—periodical and website publishers, music labels, film and TV studios—could decide to pursue the archive themselves, to configure a new legal precedent to their favor, and either take down everything they’d like or further hurt the IA’s finances. After all, as Castle said, “it’s not just books” that the archive stows: “It’s recorded music, it’s lyrics, it’s sheet music, it’s whatever you would find that’s in a library: sound recordings and various kinds of writing. All of those have different copyright treatment.”

Whichever way the case swings, the core fear of many netizens, that this suit could be the end of the Internet Archive as we know it, may be overblown. Terry Hart, general counsel for the Association of American Publishers, emphasized that the only targets of this suit are the Open Library and the 127 works from the plaintiffs’ companies that were uploaded to the archive, not the Wayback Machine or any other types of files stored by the IA. Hart additionally noted that the requested compensation of $150,000 per work would add up to more than $19 million across those 127 titles. According to IRS filings, the IA’s total annual revenue has exceeded that amount each year since 2018. The bulk of its earnings comes from charitable contributions, which in 2019 alone added up to $30 million. Yes, running the IA costs money, and a lot of it. But it seems the archive could, with the help of donors, cover $19 million. (In late 2021, the IA already used the specter of legal costs to run a vigorous fundraising drive.)
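
For readers who want to check the math, the $19 million figure follows from simple multiplication, assuming the $150,000 is sought for each of the 127 works named in the suit (that amount matches the statutory maximum for willful infringement of a single work under U.S. copyright law). The short Python sketch below reproduces it; the variable and function names are illustrative only.

```python
WORKS_IN_SUIT = 127                 # titles named by the publishers, per the article
ASSUMED_DAMAGES_PER_WORK = 150_000  # dollars; assumed to apply to each work


def total_exposure(works: int, damages_per_work: int) -> int:
    """Return the total dollars at stake if every work draws the same award."""
    return works * damages_per_work


total = total_exposure(WORKS_IN_SUIT, ASSUMED_DAMAGES_PER_WORK)
print(f"${total:,}")  # $19,050,000
```

That total sits well below the $30 million in contributions the archive reported for 2019, which is the comparison behind the claim that the sum looks coverable.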

But if the publishers do win, will authors and creators themselves actually benefit? While authors Slate spoke with have supported the Internet Archive, the Authors Guild released a statement in favor of the publishers in July; in a recent email, it referred to the Open Library as “morally bankrupt.”

“We offered to work with Internet Archive in 2017 to create a licensing system that would make Open Library compliant with the copyright law, and that offer was rejected,” Authors Guild CEO Mary Rasenberger wrote to Slate. “Internet Archive’s unwillingness to work with authors and publishers to make their program legal unfortunately made a lawsuit the only recourse.” The guild may not speak for all of its members, but it does have some pretty well-known names in its ranks, including Sherman Alexie, Judy Blume, and Alexander Chee.

Onyebuchi expressed surprise that publishers are feuding with the Internet Archive at all, rather than working together. “Digitization and the onlineness of things is simply a fact of life,” he said. “The Internet Archive is a resource and it could have been used as one by publishers.” He suggested the archive could further aid authors and their publishers by adding links on CDL books to the authors’ backlists, or to similar books in an “if you liked this, try that!” fashion. In this sense, the Internet Archive might have become even more beneficial to publishing companies than print libraries.

If the Internet Archive does lose the case, Suber believes it would put an end to CDL now and in the future: “We will not be legally allowed to take full advantage of the affordances of the Internet for sharing literature, and that would be a tragedy.” The Internet Archive would obviously have to shut down its practice, and any libraries using or considering CDL would draw back, too. Ultimately, that would hurt researchers and readers alike.

More symbolically, a loss for the archive here would mean yet another blow to the techno-optimistic, quasi-libertarian vision that governed the web’s early years: virtual spaces free from elite control and disruption, an online society based on sharing and discourse over profit. “This is not a direction that the internet has ended up going, this kind of free-to-use, nothing-costs-anything utopian vision,” Witt said. “The powers that be made that go away, and this is just a cleanup action in a fight that was won long ago.”

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.