The Google book settlement, which the search giant signed with the Authors Guild and the Association of American Publishers in 2008 after a dispute over the company’s scanning of books, was struck down by a judge this week as too far-reaching. That is arguably true (although Google would undoubtedly disagree). But the fact that the arrangement has been rejected might not be such a bad thing, because it puts the spotlight back where it should be: on the fact that Google is doing nothing wrong, legally or morally, in scanning books without the permission of the authors or the publishers of those books.
Just to recap, Google started scanning books sometime in 2002, as part of its expressed desire to “index all of the world’s information.” In addition to deals with certain publishers and various university libraries — deals that are not affected by the book settlement or the legal ruling — Google also began sourcing and scanning books that were either in the public domain or were “orphaned” (a term used to refer to books that are still under copyright, but whose author or publisher can’t be found).
So far, so good. But Google also started scanning and indexing books that were under copyright, and then offered authors and publishers the ability to “opt out” of the program and have their books removed. Some felt that this was a good bargain — especially since Google was going to help promote their books (by revealing them in search and at the Google Books site) and give readers an easy way to buy them. Others, however, said that scanning and indexing their books without explicit permission was wrong, and filed the lawsuits in 2005 that led to the agreement.
The crux of this argument is that scanning a book makes a copy of that book, and that copying is not permitted unless a copyright holder specifically agrees. The authors and publishers made this argument despite the fact that Google only ever shows a small fraction of a text when it displays a book online. It’s not as though the company planned to make copies of all books freely available to anyone through some kind of Google Books version of Napster. But the plaintiffs argued that simply scanning them was bad enough.
This is a ridiculous position, and always has been. Scanning something makes a copy of it in the same way that my viewing a web page makes a copy of it in the RAM of my computer — I’m surprised that authors and publishers haven’t tried to argue that this is secondary copyright infringement as well.
The reality is that Google’s use of selected extracts from books or any other work is protected by the principle of fair use, which allows limited use of published content of all kinds (text, images, etc.) without asking for permission from the creator or the rights holder. It’s the same principle that allows Google to index and show search results for images, web pages and other content without having to ask every single site publisher or photographer. Whether a use qualifies is judged against the so-called “four factors” test, and Google arguably passes on all four.
Why is this important? Because without that ability, search engines as we know them couldn’t exist, and they are a positive force for society as a whole — just as having a single way to search (and buy) every published book in the world would be a positive thing. Imagine if we were setting up public libraries now: would any author or publisher agree to have copies of their books just sitting there on shelves, for free, with anyone allowed to borrow them for as long as they wanted to? Unlikely (and e-book publishers like Amazon are trying to roll back borrowing abilities for digital works as well).
The big problem with the Google book settlement, as noted by the judge who struck it down, is that the settlement gave the web giant the exclusive right to do whatever it wished with all scanned works, including selling orphan books, which is arguably over-reaching. But that doesn’t change the fact that Google’s initial impulse was the right one: it does have the right to scan and display extracts from books, regardless of what the Authors Guild and the AAP say, and it should continue doing so.