The presentation I attended on the Japan Memory Project, which I covered in my last posting, also discussed another part of their institute’s online efforts. Wakabayashi Haruko introduced us to their Online Glossary of Japanese Historical Terms, which allows researchers to search a database of (currently) about 21,000 pre-modern historical terms. The contents of the database are drawn from the glossary entries found in many English-language works on pre-modern Japanese history (works in other languages will apparently be included later). For example, if you search for the term 天皇, the glossary will show you how seven different works, including the Cambridge History, have translated and romanized the word.

You can also enter whole passages, perhaps copied and pasted, into their search box. However, their search algorithm does a poor job of separating the words, because it is based on modern Japanese rather than classical. Although an audience member was hard on them for this, the truth is that such algorithms are still full of errors even for modern Japanese and Chinese. According to one Chinese language professor I heard present at a recent conference in New York, the careers of many bright programmers are dedicated to solving the difficult question of how to accurately divide words in texts without spacing.
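To give a feel for why this is hard, here is a minimal sketch of one of the simplest approaches to dividing unspaced text, greedy forward maximum matching against a word list. It is not the glossary's actual algorithm, which I know nothing about; the `segment` function and the tiny word lists are made up for illustration. The point is only that the result depends entirely on the vocabulary, so a list built for one form of the language will chop up terms from another.

```python
def segment(text, lexicon):
    """Greedy forward maximum matching.

    At each position, take the longest lexicon entry that matches;
    if nothing matches, fall back to a single character.
    """
    # Longest entry in the word list bounds how far ahead we need to look.
    max_len = max((len(word) for word in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy word lists, for illustration only.
FULL_LEXICON = {"天皇", "将軍", "征夷大将軍"}
SMALL_LEXICON = {"天皇", "将軍"}  # lacks the longer historical title

print(segment("征夷大将軍", FULL_LEXICON))
print(segment("征夷大将軍", SMALL_LEXICON))
```

With the fuller list the title comes back as one word; with the smaller list it is broken into single characters followed by 将軍, which is roughly what happens when a modern vocabulary meets a classical text.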

UPDATE: The glossary seems to have moved to a new address. Its new home can be accessed here: Access to the Japanese Historical Terms Glossary and other databases

