Affective Lexicon in Chinese – Construction and Annotation
Date Issued
2015
Date
2015
Author(s)
Lu, Pei-Yu
Abstract
An affective lexicon is a fundamental resource for sentiment detection. However, most existing Chinese affective lexicons consist mainly of affect-denoting words and lack affect-signaling words. From the perspective of cognitive semantics and pragmatics, affect-signaling words play a critical role in how emotion is expressed in language use. Semantic prosody explains how neutral words can become associated with positive or negative polarity, while functional theory shows that the connection between words and meaning is not one-to-one, and neither is the connection between words and emotion. The correspondence between emotion and linguistic expression may extend beyond word boundaries to chunks. Therefore, this research aims to collect and annotate affect-signaling words and to organize them, together with affect-denoting words, into a multi-dimensional affective lexicon for Chinese. The result is intended not only as an open resource for sentiment analysis but also as evidence of how functional grammar works in sentiment detection in texts. The research involves two phases. The first is the manual collection, annotation, and categorization of the affective lexicon. The second is evaluation and application. In the first phase, affect-denoting words are categorized into five categories (happy, sad, scared, angry, and surprised) and three levels (emotion, mood, and temperament) according to strength and duration. Affect-signaling words are collected and annotated from two data sources: author-oriented emotional articles (from BBS) and reader-oriented emotional news (from Yahoo News). In addition, common emotion-expression words are collected, including interjections, emoticons, and expletives. In phase two, the emotion-prediction ability of each affect-signaling word is calculated as the mean emotion value of the ten words that follow it.
To measure the result, a random sample of affect-signaling words is added to the NTUSD as the affective lexicon for sentiment analysis, and accuracy is compared with and without the affect-signaling words. Accuracy improves by 4.78% for positive affect-signaling words and by 18.18% for negative ones. In the application, the whole affective lexicon is applied to an unsupervised machine learning approach to sentiment detection on Chinese micro-blog data (Magistry et al., 2015), yielding a promising improvement of nearly 2% over the original F1-score.
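
The phase-two scoring mentioned in the abstract (the mean emotion value of the ten words after an affect-signaling word) can be illustrated with a minimal sketch. This is not the thesis's implementation. The function name `signal_score`, the toy lexicon values, and the choices to treat words missing from the lexicon as neutral (0.0) and to average per occurrence before averaging across occurrences are all assumptions made for illustration.

```python
from statistics import mean


def signal_score(tokens, signal_word, emotion_values, window=10):
    """Mean emotion value of the `window` tokens after each occurrence
    of `signal_word`, averaged over all occurrences.

    tokens: list of segmented words (hypothetical input format)
    emotion_values: dict mapping a word to a numeric emotion value
    Returns None if the signal word never has a following token.
    """
    per_occurrence = []
    for i, tok in enumerate(tokens):
        if tok != signal_word:
            continue
        following = tokens[i + 1 : i + 1 + window]
        if not following:
            continue
        # Assumption: words absent from the lexicon count as neutral (0.0).
        values = [emotion_values.get(t, 0.0) for t in following]
        per_occurrence.append(mean(values))
    return mean(per_occurrence) if per_occurrence else None


# Toy data, invented for illustration only.
toy_lexicon = {"好": 0.5, "開心": 1.0, "難過": -1.0}
toy_tokens = ["竟然", "好", "開心", "但是", "難過"]

positive_lean = signal_score(toy_tokens, "竟然", toy_lexicon, window=2)
negative_lean = signal_score(toy_tokens, "但是", toy_lexicon, window=2)
```

With a real corpus, `window` would stay at its default of 10 to match the abstract; the smaller window here only keeps the toy example readable.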
Subjects
emotion denoting words
emotion signaling words
emotion words
semantic prosody
chunk
Type
thesis
File(s)
Name
ntu-104-R01142003-1.pdf
Size
23.54 KB
Format
Adobe PDF
Checksum
(MD5):34a19a1440e68017bc2cf65bdceefdbf