井の中の蛙

9/23/2006

Google Books: PDF Download Feature

Filed under: — K. M. Lawson @ 11:15 pm

The Google Books project, together with the Gutenberg project, is an exciting new chapter in the world’s digitization of printed materials. I have blogged at Frog in a Well – Korea about some old English-language works on Korea that are available for download in text form from the latter. On my own weblog I have expressed some frustration with the limits imposed by Google Books on the viewing of works which are not protected by copyright here.

A recent piece of news about the Google Books project was announced on Google Books’ own weblog here at the end of August. Many books on Google Books that are out of copyright (or rather, which Google has decided to treat in that manner) can now be downloaded in full in PDF format.

Some notes about this feature:

1) The downloaded work is an image PDF, usually 1-15MB in size. The text metadata for each book is not in the downloaded document. This means you cannot search for text within the document once it is downloaded, but must return to Google Books in order to search the contents.
2) Some books which a) are no longer protected by copyright and b) Google recognizes as unprotected, since it lets you browse an unlimited number of pages from the work, are strangely not available for download. For example, Masuji Miyakawa’s My Life in Japan, published in the United States in 1907, can be fully viewed online and is not protected by copyright, yet cannot be downloaded as of today. The same goes for Bushido, the Soul of Japan: An Exposition of Japanese Thought by Inazô Nitobe, published in 1905 (the 10th edition).
3) Many of the old books, especially those which cannot be downloaded despite their lack of copyright coverage, have huge “Image Not Available” error messages where the pages should be. Strangely, you can still search the text metadata for these books and return results. Clicking on the search result pages, however, will simply show “Image Not Available.” Other books have some pages missing but some showing.
4) As I have discussed elsewhere, some books which cannot possibly be covered by copyright are only shown in “snippet mode,” and in some cases searching their contents returns completely inexplicable and mistaken results. For example, the 1910 Highways and Homes of Japan by Lady Kate Lawson is bizarrely shown only in snippet mode and, as this snapshot shows, searching for “Japan” within the book gives completely wrong results.
5) The page images for tables of contents are in many cases hyperlinked. You can click directly on chapter titles in the table of contents to jump to that chapter.

How to search for books related to Japan that are out of copyright:

The easiest way is to search for something specific on the Google Books web site. However, that will return mostly results that are still protected by copyright. See this excellent summary of copyright protection at Cornell for how to determine roughly if something is protected that was published in the United States. All things published in the United States before 1923, regardless, are now in the public domain, no exceptions. There is no reason Google should restrict access to those materials insofar as it assumes visitors are viewing the content in the United States (its website says as much in its warning to those outside the US).

IN TITLE – If you want to search for something in the title, either use the “Advanced Search” link or simply precede your search with “intitle:”. For example: intitle:Japan or intitle:"Jinrikisha Days in Japan"

BY DATE – To restrict yourself to the period when all books are in the public domain, you can specify a year range using “date:”. So for example: date:1800-1922. You can also specify “Full view books” in the advanced search page to see only results in books that can be fully viewed.

So books with Japan in the title, published from 1800 to 1922, can be found by entering: intitle:Japan date:1800-1922
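If you run many searches of this kind, the two operators above can be combined programmatically. What follows is only a minimal sketch in Python: the function name build_query, its default years, and the constant are my own illustration, not anything Google provides. It simply assembles the same query text you would type into the search box and, as a convenience, shows how it could be percent-encoded for use in a URL.

```python
from urllib.parse import quote_plus

# Assumption for illustration: 1922 is the last publication year that the
# post treats as safely in the US public domain (published before 1923).
PUBLIC_DOMAIN_LAST_YEAR = 1922


def build_query(title, start_year=1800, end_year=PUBLIC_DOMAIN_LAST_YEAR):
    """Combine the intitle: and date: operators into one search string."""
    if start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    # Wrap multi-word titles in double quotes so they are searched as a phrase.
    term = f'"{title}"' if " " in title else title
    return f"intitle:{term} date:{start_year}-{end_year}"


if __name__ == "__main__":
    # Single-word title: same query as in the text above.
    print(build_query("Japan"))
    # Multi-word title, percent-encoded in case it is pasted into a URL.
    print(quote_plus(build_query("Jinrikisha Days in Japan")))
```

The first call produces the same string as the example above; the second shows the quoted-phrase form of the title search.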

Some examples of books that can be downloaded, found merely through searching for Japan in the title, some of which you might recognize:

The Awakening of Japan by Kakuzô Okakura 1904

Glimpses of Unfamiliar Japan by Lafcadio Hearn 1894

The History of Japan: Together with a Description of the Kingdom of Siam, 1690-92 by Engelbert Kaempfer, Simon Delboe, Hamond Gibben, William Ramsden 1906 (at least this edition of it)

China and Japan: Being a Narrative of the Cruise of the U.S. Steam-frigate Powhatan, in the Years… by James D. Johnston 1860

Working Women of Japan by Sidney Lewis Gulick 1915

China Vs. Japan by the New York Chinese Patriotic Committee 1919

Japan by the Japanese: A Survey by Its Highest Authorities edited by Alfred Stead 1904

A Diplomatist’s Wife in Japan: Letters from Home to Home by Hugh Fraser 1899

A Handbook for Travellers in Central & Northern Japan: Being a Guide to Tōkiō, Kiōto, Ōzaka… by Ernest Mason Satow, A. G. S. Hawes 1881

Japan and the Japanese by Talbot Watts 1852

Hildreth’s “Japan as it was and Is”: A Handbook of Old Japan by Richard Hildreth, Ernest W. (Ernest Wilson) Clement 1907

Japan and the California Problem by T. (Toyokichi) Iyenaga, Kenoske Sato 1921

Grandmamma’s Letters from Japan by Mary Pruyn 1877

Problems of the Far East: Japan, Korea, China by George Nathaniel Curzon 1894

8 Responses to “Google Books: PDF Download Feature”

  1. Charles says:

    I immediately ran into the copyright problem. I downloaded the Lafcadio Hearn book you linked, and was presented with a wonderful full scan of Volume 2 courtesy of the Harvard Library. But Google has no copies of Volume 1 that are not copyrighted.
    It would seem obvious that Google should have scanned both volumes from the Harvard Library, rather than the Kessinger Publishing reprint. As this stands, any publisher that misrepresents its reprints as copyrighted can hold a book hostage.

  2. K. M. Lawson says:

    Yes, I have discussed this really annoying problem at the blog entry I linked to at the beginning. I can only hope that Google will eventually set up a simple method for contesting the protected nature of clearly unprotected books, something which I would believe to be in the interest of the various libraries which have cooperated with Google and released their books. I cannot understand what gives the right to Kessinger to copyright its exact reprints (not even reset so they can claim some kind of creative process in their reprinting) of the works.

  3. K. M. Lawson says:

    I have just contacted Kessinger publishing and asked them to comment. I propose that we create a petition to submit to Google Books, but I suggest we wait until we hear back from Kessinger and their explanation on this issue.

  4. Roy Berman says:

    I am unable to view any of these books at all. I get nothing but a one-line summary and page count, with links to some bookstores on the side. Is Book Search currently limited to the US? I’m in Japan, and I know these books would also be considered public domain here. I suppose the solution would be for Google to check the copyright date in the metadata of each item against the local public domain cutoff year instead of just limiting access to the entire contents of everything.

  5. K. M. Lawson says:

    If it is true that they have found some way of excluding everyone outside the US, this is very troubling…

  6. It’s not Google’s fault, probably. It’s copyright law, and Roy’s right: there’s a legal solution. But you run into trouble, then, from US Intellectual Property interests, which don’t want international copyright standards (which are less stringent) applied to their US content, even overseas, so rather than do something clever which might get them into a fight, they’re doing something stupid which only offends…. people.

    You think this is dumb? Try applying the law to braille materials….

  7. Roy Berman says:

    I emailed Google about this through the bug report form, and I got a fairly speedy reply saying that they are aware of the problem, and certainly hope to only restrict people as much as necessary in the future. I can understand why they err on the side of caution, being already embroiled in publisher lawsuits related to this service, but I do hope that they manage to work out these issues someday. At least with the PDF download feature, when I do go back to the US I can save all the interesting-seeming books I can find on my hard drive, in case I can no longer access them when I go abroad once more. And of course, since those books ARE public domain, one could legally redistribute them, say as a torrent.

  8. [...] I have at once lauded but also complained about severe flaws in Google’s book search in an earlier posting here at Muninn and also at a Frog in a Well posting. My two biggest complaints at this time are: [...]
