FW: Google- the book scanning project- Microsoft of the future other issues- the scenario gets murkier

The Google Library book scanning project has raised many issues in the world of information. Please read the message below for some more light (or is it more dust in the cloud?). It makes an interesting read anyway.

- Shalini

-----Original Message-----
From: owner-ipr@mailhost.soros.org [mailto:owner-ipr@mailhost.soros.org] On Behalf Of Darius Cuplinskas
Sent: 01 March 2005 14:02
To: ipr & public domain list
Subject: [ipr] How Google will scan the world, 1 book at a time

This project, although valuable, will not provide full-text access, and will entail severe restrictions even on "fair use" permitted by copyright law:

"You can't print the book out or digitally cut and paste any of it, even for this, an out-of-copyright volume. The restrictions will be more severe on in-copyright volumes, essentially those printed after 1920, limiting searchers to seeing a line or two around their search term."

Brewster Kahle of the Internet Archive is bringing together a group of libraries to organize a rival project which would allow full-text access and put fewer restrictions on in-copyright materials.

DC

-------- Original Message --------

How Google will scan the world, 1 book at a time
Steve Johnson
Chicago Times
February 25, 2005

As Google prepares to create the world's most comprehensive digital library, it's getting harder not to think of the company as the next Microsoft, morphing from a friendly Internet helper with a cutesy name into an awesome and inescapable force of digital nature.

Already the dominant search engine, the California technology company is testing Gmail, a free e-mail service that's likely to be a blockbuster, Google Maps, a rival to Mapquest, and Google Desktop, a function that allows users to search within their computers much more quickly than Microsoft Windows does.

And then there's the small matter of accumulating vast chunks of humankind's recorded knowledge.
Even as you read this, Google scanners are busy making bits and bytes out of books from the Harvard, Stanford, Oxford and University of Michigan libraries, as well as from the New York Public Library, a project that ultimately could total more than 57 million volumes.

The company's goal in all it does, it says, is "to organize the world's information and make it universally accessible and useful" (along with, you know, selling ads alongside the world's information).

But another goal seems to be ubiquity, and as Bill Gates has learned, when you become unavoidable, you also become resented.

Altruistic project

For now, though, Google remains mostly well-liked because the core of its business, the search engine, is so good. And the book-scanning project it's taking on seems more altruistic than not because Google is bearing the enormous (and, of course, unspecified) cost of copying the books into the digital world.

"There are [other] people who had that vision," says Sidney Verba, director of the Harvard University Library. "What Google had was the vision and the ambition, the technical skill. ..."

Shifting to a whisper, he adds the most important factor: "a lot of money -- in some sense I think more than they knew what to do with. ... For these kinds of people to invest millions and millions of dollars in it, it is a good exploitation of the profit sector for public purposes."

The University of Michigan says its pre-Google digital collection of 21,000 volumes is among the country's "most ambitious." But making it -- actually placing the books, page by page, onto scanners and then making sure the result is clean and accurate -- is very slow, hard work.

"At its current rate of digital production," the school explains in a press release, "it would take the university more than a thousand years to digitize the 7 million volumes in the collection. Google plans to do the job in a matter of years."
Beyond vague talk about Google having developed a much more efficient process, the project's specifics are secret. At Harvard, for instance, Google won't allow reporters to visit or photograph the scanning currently being done -- of 40,000 volumes as a kind of pilot project, just to make sure the books don't get damaged or lost -- at the university library's 5-million-volume off-campus storage facility.

But the aims seem transparent enough. It will bring to the masses these great research institutions, full of books one would normally need a plane ride and permission to access, and make them as easy to search for and within as a particular city's restaurant listings.

"The company as a whole has been really excited about it," says Susan Wojcicki, Google's director of product management, in part because it relates to the company's roots. "The founders were working on a library digitization project when they wound up creating a search engine that today is called Google.

"We're just really excited to be working with these institutions as well. A lot of them have been around for hundreds of years," adds Wojcicki.

Google's age: 6 1/2.

Libraries will get copies

The books will be gathered by Google in the Google Print company subdivision, known previously for trying to get publishers to put their current books in print up for public perusal. The libraries will also get their own copies of their texts turned binary.

"We're very anxious to make sure this is of real service to our own users and a public good, as well," Verba says. "We're very sensitive to not having somebody come back and say, 'Look, you've just turned over to a monopoly something that should belong to the world.'"

The project, almost everybody agreed at its December announcement, holds enormous promise. Scholars will be able to learn, at the press of a button, which books have discussed Francis Bacon, for instance, and in what manner.
Journalists and bloggers can add books to their research repertoire; previously, they have used mostly other journalism, the already-digitized and quickly searchable record of newspapers, magazines and some television shows.

"There'll be more of that contextual information which you'll be able to get more readily," says Harvard's Verba. "How that'll change the way people think, I don't know, but it's really exciting.

"It will give us this huge digitized file of our books to do things with that we don't know yet what they are. Searching text is really a thing that's on the frontier. As it evolves, we will have a text to search. We are thinking of it as a very valuable resource for things that are not 100 percent clear."

Ordinary users can see the project at work already. Type "books and culture" into the company's familiar search interface, and you get access not only to all the Web pages Google has indexed that contain those words, but also, right near the top of your search, to one of the early books to be scanned in, an 1896 tome of that title by Hamilton Wright Mabie.

Right away you notice that there are no ads on the page, and there are no plans to add them "at this time," Wojcicki says. You may also notice there are links to help you "Buy this book" and to "Find this book in a library."

No printing allowed

You can't print the book out or digitally cut and paste any of it, even for this, an out-of-copyright volume. (The restrictions will be more severe on in-copyright volumes, essentially those printed after 1920, limiting searchers to seeing a line or two around their search term. But Stanford law professor Lawrence Lessig, writing in the Los Angeles Times, questioned the right of Google to make available even "snippets" of copyrighted material.)

Verba says you can certainly make notes on it, and that a print-on-demand feature is worth considering.

In the meantime, want to know whether Mabie's "Books and Culture" contains the word "ominous"?
Type the word into the "Search within this book" field. There it is on page 193, in a sentence praising the ability of the cultured man to recognize the characteristics of his era.

The thing that's really encouraging, Verba says, is that the project turns the current fears of scholars and parents on their ear.

"There's no academic and in fact there's no parent with a teenager who isn't worried about the fact that that generation will believe that all knowledge is on the Internet and on Google and will never want to open a book again," he says.

"The nice thing about this project is that it's a kind of, 'If you can't beat 'em, join 'em.' People will go to Google, and they will find books, and they will have to then go to the library and get the books. We think this is really a nice way of squaring that circle."

----------
Shalini R Urs