Four publishers[1] have finally taken their beef with the Internet Archive to court. They want to shut down the Archive's digital Open Library of 1.3 million books, most of them published from the late 20th century to the early 21st: fairly recent, but not brand new. Most of the books are still well within their copyright terms, yet the Internet Archive pays no royalties to the publishers or authors. Publishers don't much like this arrangement.
The Internet Archive did shut down its recent National Emergency Library, opened to serve more readers during the coronavirus stay-at-home orders, earlier than planned. It has not shut down its main Open Library.
The Open Library was begun in 2006 by the Internet Archive. The plan is to eventually scan every book in existence and make them all freely available to the public. As such, it is even more ambitious than Google Books, which, for books still under copyright protection (published after 1924), aims only to make “snippets” free to the public. Google limits the available text of books under copyright as part of an agreement with publishers and authors. The Internet Archive has no such arrangement and does not believe one is needed.
Here is how it works. The Internet Archive collects books in large quantities and scans them. It then lends these digitized copies to the public at no charge. These aren't like e-books, created specifically for digital distribution, but photocopies of the pages of printed books. As such, they aren't quite as easy to use as true e-books but, as with Google Books, the price is right. However, rather than posting them online so everyone can read them at once (like Google Books), the Open Library lends you an electronic file. It lends no more files at once than it has copies of the book in inventory, effectively placing on itself the same limit that physical libraries face. It is a process the Internet Archive calls “controlled digital lending.” The publishers, in their lawsuit, describe “controlled digital lending” as “a manufactured legal paradigm, conceived by IA, to cast aside well-established copyright jurisprudence.”
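The owned-to-loaned limit described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the Internet Archive's actual system: the class name `ControlledLendingTitle`, its fields, and the reader names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ControlledLendingTitle:
    """One title under an owned-to-loaned limit (hypothetical model)."""
    title: str
    copies_owned: int                              # physical copies in inventory
    borrowers: set = field(default_factory=set)    # readers with an active loan

    def checkout(self, reader: str) -> bool:
        """Lend a digital copy only if an owned copy is not already on loan."""
        if reader in self.borrowers:
            return True   # reader already holds a loan; nothing changes
        if len(self.borrowers) >= self.copies_owned:
            return False  # every owned copy is out; the reader must wait
        self.borrowers.add(reader)
        return True

    def give_back(self, reader: str) -> None:
        """End a loan, freeing one copy for the next reader."""
        self.borrowers.discard(reader)


book = ControlledLendingTitle("Example Title", copies_owned=2)
results = [book.checkout(name) for name in ("ann", "bob", "cat")]
print(results)               # the third request is refused while both copies are out
book.give_back("ann")
print(book.checkout("cat"))  # a returned copy can be lent again
```

Lending without this limit would amount to dropping the `copies_owned` check, so that any number of readers could hold the same title at once.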
Unlike the Open Library, the Internet Archive's National Emergency Library did not limit the number of digital copies it loaned to the number of physical copies it possessed. The Archive's justification was that the coronavirus stay-at-home orders created a temporary emergency. However, as previously noted, the Internet Archive has closed down this uncontrolled digital lending.
What we have here is a collision between two well-established legal doctrines. One is copyright protection, provided for in the U.S. Constitution and codified in law. The other is the long-established legal doctrine, handed down through the courts, known as the “First-Sale Doctrine.” This is what allows you to give a book you purchased to a friend, or sell it to a used bookstore, without running afoul of copyright law. This is also what allows libraries to loan you a book they have purchased without having to pay anything further to the publisher. It says that once there has been a first sale of a book, the owner of that copy can do with it as they please. No additional payments are required. It is this doctrine that the Internet Archive argues applies to its “Open Library.”
Now, copying a book is not something protected by the First-Sale Doctrine; on that point, copyright law takes precedence. However, the Internet Archive's position is that lending a photocopy of a book it owns is essentially the same as loaning the physical copy, so long as it keeps the physical copy in its possession and doesn't loan that out. This is what it calls “controlled digital lending,” which, in its opinion, is simply an update of the First-Sale Doctrine to current technology. The publishers argue that this is still copying a book, a violation of copyright law.
There are some exceptions to the ban on copying copyrighted works that the Internet Archive may attempt to argue, beyond just saying this is effectively the same as loaning the printed copy. One is a doctrine known as “Fair Use.” This is what lets you quote from a copyrighted book for the purposes of a book review, or to support or challenge an author's opinion. It is also what protects parody. The test here is whether it is a “transformative use”; a book review, for example, is something different from the book itself. Perhaps the Internet Archive will argue that the vastly different digital format is a “transformative” use. However, while there is no stated limit on how much of a work may be copied under Fair Use, it is generally understood to be some limited portion. Copying an entire work, word for word, may be a hard sell under the Fair Use doctrine.
Another exception is found under Section 108 of Title 17 of the U.S. Code, Limitations on Exclusive Rights: Reproduction by Libraries and Archives. There are certain conditions which must be met, and the legalese isn't always easy to follow. It seems unlikely that the section was meant for a situation such as this, though that doesn't mean it cannot apply. The publishers make several arguments as to why this exception isn't applicable, such as that it applies only if the copier is not a commercial enterprise. They argue that the non-profit Internet Archive is a commercial enterprise because it does some scanning for others for a price, and because its founder is also the owner of Better World Books, from which it sources some of its books. These strike me as not very strong claims.
However, this exception applies only if the work is “duplicated solely for the purpose of replacement of a copy or phonorecord that is damaged, deteriorating, lost, or stolen, or if the existing format in which the work is stored has become obsolete.” Since these books aren't damaged or missing, the remaining claim must be that the existing format has become obsolete. That may be a stretch for printed books, but perhaps the ease of obtaining a digital copy anywhere in the world, where a physical copy could be very difficult to get, might lead one to declare printed books “obsolete” today. Unfortunately for IA, the code says that “a format shall be considered obsolete if the machine or device necessary to render perceptible a work stored in that format is no longer manufactured or is no longer reasonably available in the commercial marketplace.” I don't know what type of device is needed to read a printed book. Perhaps eyeglasses, but those are still readily available.
Even if the Internet Archive makes that claim successfully, there is another roadblock it must overcome. It must also show “(1) the library or archives has, after a reasonable effort, determined that an unused replacement cannot be obtained at a fair price; and (2) any such copy or phonorecord that is reproduced in digital format is not made available to the public in that format outside the premises of the library or archives in lawful possession of such copy.” One can argue #2 either way, depending on whether the offering is technically made inside or outside of the library, but #1 is harder to show if the book is still in print or available as an e-book. If the book is out of print and not offered as an e-book, the Internet Archive's case is stronger, since an unused replacement is unlikely to be available at a reasonable price. That still leaves the question of whether printed books are an “obsolete format.”
Who will win this case, and who should? Courts sometimes attempt to mathematically analyze the wording of the law to discover what may be indecipherable: its exact, objective meaning. Other times, they will attempt to decide what is just and then interpret the words to reach what they conclude is a just decision. The Internet Archive provides some wonderful services, not just free books, but audio recordings, old television news programs and software. It also runs that incredible tool, the Wayback Machine, which preserves earlier versions of websites, including defunct ones. Its cause may be just. However, the reason for copyrights is to provide an incentive for writers to write. If they can't get paid, they will not write. Traditional libraries, though protected, can potentially reduce sales, but each one has to buy its own copies and serves only a small geographic area. The Internet Archive needs to obtain only one or a few copies to serve the entire world, and even then, it picks up used copies, so the authors and publishers get nothing. Authors and publishers need protection, and that is just and fair.
However, there are also many older books long out of print, no longer obtainable outside a small number of libraries, if that. Their authors are long dead, and their publishers, if not out of business entirely, no longer print them and will never sell another copy. Their copyright no longer serves the purpose of encouraging writers to write, and yet the book may be virtually unobtainable because it is still under copyright. If the book is out of print, and no longer earning royalties for its author, the public good is better served by allowing someone like the Internet Archive to make it available free to the public.
[1] Hachette Book Group, HarperCollins Publishers, John Wiley & Sons, and Penguin Random House.