https://scholars.lib.ntu.edu.tw/handle/123456789/413154
Title: | A comparison between microblog corpus and balanced corpus from linguistic and sentimental perspectives | Authors: | Tang Y.-J.; Li C.-Y.; Chen H.-H. |
Issue date: | 2011 | Volume: | WS-11-05 | Pages: | 68-73 | Source publication: | AAAI Workshop | Abstract: | While microblogging has gained popularity on the Internet, analyzing and processing short messages has become a challenging task in natural language processing. This paper analyzes the differences between Internet short messages (or "microtext") and general articles by comparing the Plurk Corpus and the Sinica Balanced Corpus. Likelihood ratio and the Tongyici Cilin thesaurus are adopted to analyze the lexical semantics of frequent terms in each corpus. Furthermore, the NTUSD sentiment dictionary is used to compare the sentiment distribution of the two corpora. The result is also applied to sentiment transition analysis. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/413154 | ISBN: | 9781577355212 |
Appears in collections: | Department of Computer Science and Information Engineering |