The Googlization of Everything (Siva Vaidhyanathan)

• Google agreed to offer (with strict controls on the ability to print and share) full-text copies of certain out-of-print books for sale as downloads.

• Google undertook to offer much better access to many out-of-print works still under copyright. Before the settlement, Google offered largely useless excerpts of these texts. The settlement provided for much richer and broader access.

• Google agreed to provide designated computer terminals in U.S. libraries that would offer free full-text, online viewing of millions of out-of-print books. Google would forbid printing from these terminals, but users would be able to purchase electronic copies of the books from these terminals.

Compared with the severe limitations on user access to most twentieth-century works under the original model for Google Books, this new model promised to improve the service substantially. In addition, the settlement aimed to avoid the threat of the great copyright meltdown outlined above. Clearly both sides saw real risks in forcing a courtroom showdown. However, back when Google introduced the library-scanning project as part of the Books program, many copyright critics celebrated the fact that a big, rich, powerful company was taking a stand to strengthen fair use. That never happened. Fair use in the digital world is just as murky and unpredictable as it was the day before the settlement. But what of the other problems and pitfalls inherent in the settlement? Critics of Google Books expressed serious concerns about a wide range of issues. Immediately after the announcement of the settlement, I asked Google’s legal department the following questions:

“Isn’t this a tremendous antitrust problem?” I asked. Google had essentially proposed a huge compulsory licensing system without the legislation that usually makes such systems work. In addition, this proposed system excludes many publishers, such as university presses, and authors who are not members of the Authors’ Guild. More important, this system would have excluded the other major search engines and the one competitor Google has in the digital book race, the Open Content Alliance. Wouldn’t these parties have had a very strong claim for an antitrust action?21

The Google legal team did not believe that this agreement was structured in such a way as to exclude others from developing a competing service. The agreements with and about publishers, libraries, and the registry were all nonexclusive, as is typical of Google’s approach to competition in the Web business. The registry would be started with Google funds, but it would be an independent nonprofit entity able to deal with the Open Content Alliance and other services without restriction from Google. Generally, Google’s lawyers did not see this service as presenting a typical antitrust problem. There are so many segments to the book market in the world, including real bookstores, online stores such as Amazon.com, and used-book outlets, they claimed, that no single entity or sector can set prices for books (even out-of-print books) effectively. There are always competing sources, including libraries themselves.22

“But isn’t this a potential privacy nightmare for libraries?” I asked. Would Google compile personally identifiable information from users of its free terminals (for example, by requiring them to log in to Google Docs or some other service)? Would Google collect search and usage data from these library terminals to “improve” searches? Would such data be open for study by publishers or media scholars? How long would Google retain such data if it were compiled?

The response from Google’s lawyers, in November 2008, exhibited a willingness to examine this potential problem. They indicated that much about the design of the program was yet to be determined. Google had not agreed to share personal information with publishers, but the company might share aggregate data collected through the service. And although Google had not yet designed the system, the legal department predicted that users would not have to log in to Google to use the public terminals. The legal department assured me that the company would “build in privacy protections” with the guidance and assistance of the library partners.

SELLING OUT LIBRARIES AND CORPORATE WELFARE

The strongest criticism of the Google Book program had always concerned the actions of the university libraries that have participated in this program, rather than Google itself or the effects of the program on libraries in general. The advantages to libraries of the settlement are twofold. First, they might face much less legal risk by permitting Google to scan books in their collections that are still protected by copyright (although future lawsuits by authors and publishers who live outside the United States, Canada, Australia, and the United Kingdom—the only countries covered by the settlement—remain a risk). Second, because Google has pledged to place designated terminals in public and university libraries across the United States, many libraries that never had the funds or space to build large collections of works would be able to offer their users greatly expanded access to electronic texts.

But the negative effects of these changes could be significant as well. Libraries might choose to remove physical books from their collections if they considered electronic access via Google to be sufficient. Of greater concern is the fact that every library in the United States might soon have what is in effect an electronic book-vending machine, run by and for Google, operating in an otherwise noncommercial space. Every library would soon be a bookstore. The commercialization of libraries and academia is not a new story, but it remains a troubling one. Inviting Google into the republican space of the library directly challenges its core purpose: to act as an information commons for the community in which it operates.

Companies such as Google should always do what is best for them. But libraries, and especially university libraries, have a different, more altruistic mission and clear ethical obligations. From the beginning, Google Books has seemed to be a major example of corporate welfare. Libraries at public universities all over this country (including the one that employs me) have spent many billions of dollars collecting these books. Now libraries are offering these books to one company that is cornering the market on online access. They accepted Google’s specifications for the service uncritically, without concern for user confidentiality, image preservation, image quality, search prowess, metadata standards, or long-term sustainability. They chose the expedient way, rather than the best way, to extend their collections. They have been complicit in centralizing and commercializing access to knowledge under a single corporate umbrella.

Under the rejected settlement, elements of library collections would have been offered for sale through a private contractor. Perhaps this change is only a matter of degree, but perhaps it is instead a major mission shift. Ultimately, we have to ask, is this really the best possible system for extending access to knowledge?

The privatization of some library functions is not necessarily a bad thing. We should not pretend that libraries operate independently of market forces or without outsourcing many of their functions to private firms; but many of the thorniest problems facing libraries today are a direct result of rapid privatization and onerous contract terms. There are too many devils in too many details.

Even by offering apparently benign services like free library access to electronic texts, Google serves its own masters: its stockholders and its partners. It does not serve the people of the state of Michigan or the students and faculty of Harvard University. The main risk of privatization is simple: libraries and universities last, but companies wither and fail. Should we entrust our heritage and collective knowledge to a company that has been around for less than fifteen years? What will happen if stockholders decide that Google Books is a money loser or too much of a liability? What if they decide that the infrastructure costs of keeping all those files on all those servers are not justifiable?

The early celebration of Google’s library project revealed an unfounded and unfortunate assumption: that the role of the librarian in the global digital information ecosystem is superfluous. It also ignored serious quality-control issues. Google has never publicly discussed the principles on which the book search engine will operate. In contrast, librarians and libraries operate with open and public standards for metadata and organization. Metadata—data about data—is particularly important.23 Without metadata—such as subject headings, keywords, and quality indicators—embedded in the files, a search for books about the Holocaust is just as likely to yield books denying the event as examining it. Good metadata standards generate better search results. Poor metadata standards can yield ridiculous or dangerously misleading results.24 So far, we have no reason to believe that the transfer of this indexing function from a public university library to a private entity will involve good or open metadata standards.

COPYRIGHT AND THE PRIVATIZATION OF KNOWLEDGE

Now that Judge Chin has rejected the settlement agreement between the publishers and Google, we are back where we were in 2008, when Google was mounting a fair-use defense against the publishers. This time, however, Google will have a harder time convincing the public and courts that it has the right under fair use to continue to scan the contents of libraries for its own use. By settling the lawsuit with publishers (and thereby surrendering its claims that the wholesale scanning of books is fair and legal), Google has managed to lock in a tremendous advantage for itself. No other institution could reasonably have pursued a massive scanning project in the knowledge that publishers would sue right away; and no other entity would be able to compel the plaintiffs to settle on terms anywhere close to those Google has negotiated. So, regardless of the disposition of the settlement, unless we reform copyright to allow more innovative uses of material that is now sitting on shelves, underused, we are stuck with the Googlization of books and nothing more. If Google shuts down the project out of fear of losing a monumental judgment in court, then we will be stuck with much less access than we have today.

The music-downloading controversy of the early 2000s provides an introduction to the parameters of these issues. Peer-to-peer music downloading was described by music copyright holders as the greatest threat to the historically successful copyright system and all the industries that depend on it.25 The 2004 case MGM v. Grokster was expected to be the showdown over the issue.26 In an amicus curiae brief I wrote on behalf of media studies professors, I argued that there is no functional distinction between the peer-to-peer interface Grokster and the popular search engine Google.27 Both are search engines that facilitate the discovery, access, and unauthorized use of others’ copyrighted works. Both “free ride” on others’ copyrighted works. Both provide a service to the public for no direct remuneration from their users, yet both are commercial entities that benefit from increased traffic and the data gathered from their users. So if you hold Grokster liable for inducing infringement, Google’s Web Search service is liable, as well.28

Of course, there is one big difference. Grokster itself did not actually do any copying: it just facilitated copying by others. Google, by contrast, makes copies of all kinds of copyrighted material. For years, it has been making cache copies of the Web pages it indexes, because its search function cannot operate without a cache index. In two cases, courts ruled that this practice does not infringe copyrights.29

Copyright on the Web, however, works in peculiar ways. A series of important court cases in the United States gave search engines and other Web enterprises confidence to innovate.30 We could not navigate the Web effectively if Google and other search engines could not freely copy and cache others’ copyrighted material. Every time you post an entry on a blog or create a new Web page, you are granting search engines a presumed license to copy it. If you wish to opt out of the Web search system, you must act. The burden is on the copyright holder. Courts have ruled that if the burden were on the search-engine companies to ask permission and negotiate terms with every one of the millions of people who generate copyrighted content on the Web every day, they would simply quit, because the costs of doing business would be too high. And thus we would have no search engines, and the Web would be unnavigable.

By copying and caching actual physical books, Google is reaching beyond the Internet and the copying and caching of Web pages. In the real world, off the Web, a copyright holder must grant explicit permission to allow someone to copy an entire work for a commercial purpose. That’s how copyright has worked for three centuries: the burden of securing permission is on the party that wants to copy the work. The default is that everything in the real world is protected. The default on the Web is that everything can be copied.

Through its scanning program, Google had hoped to impose the copyright norms of the digital world onto the analog world. Publishers, accustomed to the norms of the real world and skittish about those of the Web world, panicked and sued.31 By provoking a lawsuit over Google Books, Google not only gambled the value of the company: in the words of the University of Pittsburgh law professor Michael Madison, it “bet the Internet” on this case. If the case goes to court and Google loses, an appeals court or the U.S. Supreme Court might write a decision that would undermine the rights of search engines in general to make cache copies of Web documents without permission. In that event, the very concept of a navigable World Wide Web would collapse. No company, not even one as wealthy and successful as Google, could afford the time, labor, and funds it would take to secure permission to copy the billions of text pages, images, and videos that Google now scours for its indexes.32 That is far from the outcome that copyright laws are intended to produce, yet it was the threat that their imposition posed in the instance of Google Books.

Copyright in recent years has become too strong for its own good. It protects more content and outlaws more acts than ever before. When abused, it can stifle individual creativity and hamper the discovery and sharing of culture and knowledge.33 But the Google scanning project threatened the very foundation of copyright law. Google had hoped to exploit the instability of the copyright system in a digital age by resting a huge, ambitious, and potentially revolutionary project on the most rickety, least understood, most provincial, and most contested perch among the few remaining public-interest provisions of American copyright law: fair use.

When it settled the publishers’ lawsuit, Google managed to avoid the fundamental issues of copyright by conceding that the company had no clear fair-use right to scan millions of copyrighted works just to display them on a restricted, yet commercial platform. But this was more than a dodge. Google vaulted over the copyright conundrum and exploited its own dominance as the chief search platform in the world to corner the market on electronic library searches and delivery. It was a bold move that raised as many hard issues as it settled.

By settling, Google engineered a better position for its commercial services than it would have had it won the lawsuit. To some observers, the slim prospect of Google prevailing with its fair-use defense was clear as early as 2004. Soon after the Google Books library-scanning project was made public, Paul Ganley, a London solicitor, wrote an analysis of the Google library case under both U.S. and U.K. law. He concluded that although Google had a slight chance to prevail under the flexible fair-use provisions of U.S. law, it had absolutely no chance of surviving a challenge in British courts. Ganley presents the case as a “teaching moment” because it generates two wonderful potential exam questions: Can Google do this under existing copyright law? Should Google be able to do this under copyright law?34

I added a third question to public debate about the project, one that spoke directly to the first two: Is Google the right agent to do this? If it is, then copyright law certainly should allow ambitious and potentially beneficial uses of copyrighted material that on balance do not threaten existing markets for works. However, it is possible that copyright law already allows other institutions better suited for these efforts to undertake them.35

Within weeks of the announcement of Google’s plan to scan library collections, I concluded that legally, politically, and practically, Google was not the right agent for the job. Instead, I argued, libraries should pool their efforts and resources to accomplish such massive digitization and access projects themselves. Because Google is such an inappropriate choice, its legal argument is inherently weakened: thus the answer to Ganley’s first question is no. However, by avoiding a courtroom showdown over the scanning project, Google’s actions injected uncertainty into the projects that other organizations might pursue. If public and university libraries were to team up to generate a similar service, would they be bold enough to create cached copies of millions of scanned books still under copyright? Would the existence of a new market for out-of-print books available from Google Books prejudice a court against a fair-use defense mounted by libraries? Answers to these questions depend on the answer to a more general formulation of Ganley’s second question: Under copyright law, should any entity be able to create cached copies of millions of scanned books still under copyright?

Back when it looked as though Google would mount a bold case to expand fair use, distinguished scholars and litigators such as Jonathan Band, William Patry, Fred von Lohmann, Cory Doctorow, and Lawrence Lessig all voiced enthusiasm for the Google project and launched defenses of the firm’s copyright strategy.36 Each of these writers relied on the traditional (and statutory) “four-factor” analysis of Google’s use of the works: the character of the intended use, the nature of the work to be used, how much of the work would be used, and the harm to potential markets.37 Each of them minimized the fourth factor, declaring that the Google project would not harm the sale of books and might enhance it. In addition, they concurred, several important cases in recent years have shown that commercially viable uses are not beyond the scope of fair use.38 All their arguments treated the snippet of text that Google users would encounter when clicking on a link to a copyrighted work as the operative use of the work and minimized the importance of the original scanning of the book, the very copying that the publishers wanted the court to consider as operational and significant. They argued that the snippet-based interface is “transformative,” thus invoking the magic word that Justice David Souter employed in his ruling for the hip-hop group 2 Live Crew in the case of Campbell v. Acuff-Rose.39 In this view, by “transforming” the original song (Roy Orbison’s classic hit “Oh, Pretty Woman”), the defendant, Luther Campbell, created something entirely new: in this case, a parody of the original song.
