Higuchi Kōichi
Associate Professor, Ritsumeikan University

Statistical analysis of Japanese textual data using a PC: developing the free software KH Coder

This presentation shows how to perform statistical analysis of Japanese textual data using KH Coder, free software that I am developing, and illustrates the approach with actual analysis examples.

When a researcher has collected many Japanese texts, a natural first step of analysis is to get an overview of their content. Computer-based statistical analysis is well suited to this purpose. By counting words or concepts automatically and mechanically, you may notice, for example, that a word appears more frequently than you expected. Such an overview can be useful in itself and can reveal features of the data that had not been observed before.
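
The counting step described above can be sketched in a few lines of Python. This is only an illustrative sketch and not how KH Coder itself is implemented: it assumes the texts have already been split into words (Japanese has no spaces between words, so a separate word-segmentation step would be needed first), and the function name `count_words` and the sample documents are hypothetical.

```python
from collections import Counter


def count_words(tokenized_docs):
    """Count how often each word appears across all documents.

    Assumption: each document is already a list of words.
    """
    counts = Counter()
    for tokens in tokenized_docs:
        counts.update(tokens)
    return counts


# Hypothetical, already-segmented sample documents.
docs = [
    ["研究", "データ", "分析"],
    ["データ", "収集", "データ"],
]

counts = count_words(docs)

# Print the single most frequent word with its count.
print(counts.most_common(1))
```

Sorting the full table by count gives the kind of overview in which an unexpectedly frequent word stands out.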

Also, if the frequency of a specific word or concept suddenly increases or decreases in a statistical plot, the tone of the data may change at that point. When such a part is found, researchers may notice something new about the data by reading it carefully. In this way, statistical analysis suggests which parts of the data are likely to be important and should be interpreted in detail by researchers.
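
As a rough illustration of this idea, the sketch below computes the relative frequency of one word in each consecutive segment of the data and reports the segment where that frequency changes most from the previous one. It is a simplified stand-in under stated assumptions, not the procedure KH Coder uses: the segments are assumed to be pre-segmented word lists, and the names `frequency_by_segment` and `largest_shift` as well as the sample data are hypothetical.

```python
def frequency_by_segment(segments, target):
    """Relative frequency of `target` in each segment (0.0 for an empty segment)."""
    return [
        tokens.count(target) / len(tokens) if tokens else 0.0
        for tokens in segments
    ]


def largest_shift(rates):
    """Index of the segment whose rate differs most from the preceding segment.

    Returns None when there are fewer than two segments.
    """
    if len(rates) < 2:
        return None
    diffs = [abs(rates[i] - rates[i - 1]) for i in range(1, len(rates))]
    return diffs.index(max(diffs)) + 1


# Hypothetical, already-segmented data split into four consecutive parts.
segments = [
    ["家族", "仕事", "学校", "友人"],
    ["家族", "仕事", "学校", "友人"],
    ["家族", "家族", "家族", "仕事"],
    ["家族", "家族", "仕事", "学校"],
]

rates = frequency_by_segment(segments, "家族")
shift_at = largest_shift(rates)
print(rates, shift_at)
```

The reported index only points to a candidate passage; deciding whether the tone really changes there still requires reading that part of the data.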

I have been developing and proposing this statistical analysis approach and the software mainly in the field of sociology. I would be very grateful if participants could tell me whether the software could be helpful in their research fields and how I should improve it.

Statistical analysis of Japanese text data using computers: development of the free software KH Coder

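This report introduces, for researchers who intend to collect and analyze Japanese text data, a method for analyzing such data statistically, together with analysis examples. It also introduces KH Coder, the free analysis software that the author is developing to implement this method.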