When the Japanese National Diet Library started putting Meiji- and Taishō-period books online and fully viewable in their Modern Digital Library (近代デジタルライブラリー), I remember thinking, “Wow, this is amazing! If only there could be access to books in other languages on this scale!”
That collection now has over 150,000 books scanned and included in their database. You don’t need any special plug-ins and the page images are JPEGs. Great job!
This past week I have been doing some heavy-lifting research without any library access, and Google Books has once again shown itself to be a real friend. I have been able to look things up so fast, with such precision, and check even small obscure details with such ease from a kitchen in Sackets Harbor, New York, that I’m incredibly tempted to abandon my study of the 1930s and 1940s and never again touch a subject which goes past 1920. Why? Because there is a good chance that if you search Google Books for something published before 1920, it will be in full view and you can read, search, and download to your heart’s delight. There are exceptions, which I have complained about on numerous occasions, but still, each time I sit down and really do some heavy searching with Google Books I find an ever-increasing availability of even quite obscure works in their database, scanned from some of the best libraries around. The limited preview is also incredibly useful: I increasingly look things up with a quick search on Google Books instead of picking up that same book from my table half a meter away. When one knows certain tricks, the limited preview is not even that limited when you really need to read a few pages denied to you.
The internet is now filled with debates about what the Google Books settlement will mean for publishers, writers, and researchers, as well as casual readers on the internet. I don’t want to fight that fight here, but I will point out one obvious fact:
The 近代デジタルライブラリー now looks like something out of the stone age compared to the interface provided in full view on Google Books. It is downright painful to go back. It is like going from the web back to the world of gopher on a dial-up connection. Each page is slow to load, and you can only display a single page at a time. It isn’t just that Google has the money to put a lot of effort into its presentation. To be sure, it isn’t trivial to create a web-based reading experience which allows seamless scrolling while pages load in the background, along with the host of other little features they have included.
However, they decided early on that if they give you full view, they are going to give you full view: allowing PDF and ePub downloads (albeit watermarked and not searchable offline).
A lot of databases like 近代デジタルライブラリー or the アジア歴史資料センター (Japan Center for Asian Historical Records) have a completely different philosophy, even for works that have long been in the public domain: sure, we will give you a whole page, but only zoomed out. If you zoom in, we’ll give you a little piece of it in JPEG form. Multi-page download? In the latter case, no way; in the former case, they can create a special PDF for you, with a limited number of combined images.
I see how this is designed to restrict the bandwidth usage on an already slow (at least in the US) website, but this tells me that there needs to be a greater pooling of efforts – either with help from powerful private sector companies such as Google (with care to avoid some of the problems this produces, and even worse horrors of such disasters as Footnote.com) or by pooling resources between governments, or in cooperative agreements between governments and the private sector.
Side note: Google Books has a small number of old Japanese books scanned from US libraries. It has Chinese books too, but many of these were affected by complaints from Chinese authors and now have little or no access. Unfortunately, many of these books are backwards: page numbers don’t work properly and the pages are shown in reverse in many (but not all) of the old books I have looked at in the past few days. Google: if you unbind Japanese books and present them in a vertical scrolling interface, you will have to reverse the order of the pages!
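For anyone curious what that fix amounts to, here is a minimal sketch in Python. It assumes, purely for illustration, that the scans of a right-to-left bound book come out of the scanner in the wrong direction, so the last page is first in the list. The function names and file names are hypothetical and have nothing to do with how Google actually stores or serves its scans.

```python
def reading_order(scanned_pages, bound_right_to_left=True):
    """Return page images in the order a reader should scroll through them.

    Assumption: scanned_pages is listed in the order the scanner produced,
    which for a right-to-left bound book is back to front.
    """
    pages = list(scanned_pages)  # copy so the caller's list is left untouched
    if bound_right_to_left:
        pages.reverse()
    return pages


def numbered(pages):
    """Pair each page with a display number starting from 1."""
    return list(enumerate(pages, start=1))


# Hypothetical scan of a four-leaf Japanese book, back to front.
scans = ["colophon.jpg", "page_2.jpg", "page_1.jpg", "cover.jpg"]
for number, image in numbered(reading_order(scans)):
    print(number, image)
```

Renumbering after the reversal matters as much as the reversal itself, since otherwise the page numbers in the interface still count from the wrong end.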