https://scholars.lib.ntu.edu.tw/handle/123456789/413120
Title: | Chinese word ordering errors detection and correction for non-native Chinese language learners | Authors: | Cheng S.-M.; Yu C.-H.; Chen H.-H. |
Date issued: | 2014 | Pages: | 279-289 | Source publication: | 25th International Conference on Computational Linguistics | Abstract: | Word ordering errors (WOEs) are the most frequent type of sentence-level grammatical error made by non-native learners of Chinese. Learners of Chinese as a foreign language often place characters in the wrong positions in a sentence, which results in wrong words or ungrammatical sentences. In addition, Chinese sentences have no explicit word boundaries, which makes WOE detection and correction more challenging. In this paper, we propose methods to detect and correct WOEs in Chinese sentences. WOE detection models based on conditional random fields (CRFs) identify the sentence segments that contain WOEs. The features explored are segment point-wise mutual information (PMI), inter-segment PMI difference, language model, tag of the previous segment, and the CRF bigram template. Words in the segments containing WOEs are reordered to generate candidates that may have the correct word order. Models based on Ranking SVM rank the candidates and suggest the most appropriate corrections. Training and testing sets are selected from the HSK dynamic composition corpus created by Beijing Language and Culture University. Besides the HSK WOE dataset, the Google Chinese Web 5-gram corpus is used to learn features for WOE detection and correction. The best model achieves an accuracy of 0.834 for detecting WOEs in sentence segments. On average, the correct word ordering is ranked 4.8 among 184.48 candidates. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/413120 | ISBN: | 9781941643266 |
Appears in collections: | Department of Computer Science and Information Engineering |
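The abstract describes reordering the words of an erroneous segment to generate candidates and then ranking those candidates. The sketch below illustrates only that general idea and is not the authors' method: it replaces the CRF detector and the Ranking SVM with a plain sum of adjacent-word PMI scores, and every count, the corpus size, the penalty value, and the function names (`pmi`, `score`, `rank_candidates`) are hypothetical values chosen for illustration.

```python
import math
from itertools import permutations

# Hypothetical toy counts; the paper learns its features from the
# Google Chinese Web 5-gram corpus, which is not reproduced here.
UNIGRAMS = {"我": 50, "今天": 30, "去": 40, "學校": 20}
BIGRAMS = {("我", "今天"): 12, ("今天", "去"): 10, ("去", "學校"): 15}
TOTAL = 1000          # hypothetical corpus size
UNSEEN_PMI = -10.0    # arbitrary penalty for word pairs never seen together


def pmi(x, y):
    """Point-wise mutual information: log(p(x, y) / (p(x) * p(y)))."""
    joint = BIGRAMS.get((x, y), 0)
    if joint == 0 or x not in UNIGRAMS or y not in UNIGRAMS:
        return UNSEEN_PMI
    p_xy = joint / TOTAL
    p_x = UNIGRAMS[x] / TOTAL
    p_y = UNIGRAMS[y] / TOTAL
    return math.log(p_xy / (p_x * p_y))


def score(words):
    """Sum of PMI over adjacent word pairs (a stand-in for a learned ranker)."""
    return sum(pmi(a, b) for a, b in zip(words, words[1:]))


def rank_candidates(words):
    """Generate every reordering of the segment and sort by descending score."""
    candidates = set(permutations(words))
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    # A segment with a word ordering error: "今天" is misplaced after "去".
    ranked = rank_candidates(["我", "去", "今天", "學校"])
    for candidate in ranked[:3]:
        print(" ".join(candidate), round(score(candidate), 3))
```

Exhaustive permutation grows factorially with segment length, so a sketch like this is only practical for short segments; the candidate counts reported in the abstract suggest the authors also work with limited candidate sets, but how they bound them is not stated in this record.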