Ogiso Toshinobu
Professor, National Institute for Japanese Language and Linguistics

『日本語歴史コーパス』 ver. 2018.9




“Corpus of Historical Japanese” ver. 2018.9

At the National Institute for Japanese Language and Linguistics, we are constructing the "Corpus of Historical Japanese", a diachronic corpus for studying the history of Japanese from the Nara period to the Meiji and Taisho eras. The entire text is annotated with word-level information, which enables advanced searches. The corpus can be used online free of charge through the search service "Chunagon" (https://chunagon.ninjal.ac.jp).

So far, we have compiled corpora of materials from each era, from the Manyoshu of the Nara period to magazines of the Meiji and Taisho eras. We publish two to three sub-corpora every year: in March this year, we published "Muromachi period series II Christian materials" and "Edo period series I Sharebon". In addition, we plan to publish the Kokutei-tokuhon (national reading textbooks) this September.

In this presentation, we report on the features of this corpus and the latest state of its construction. We also explain the value of the newly released materials and how to use them. Specifically, we cover the "Christian materials", which include both the original Portuguese-style Roman alphabet text and a Kanji-Kana Japanese text, and the "Sharebon" and "Kokutei-tokuhon", which are linked to images of the original texts available on the Internet.