Interpretation of Chinese Discourse Markers, Discourse Relation Recognition, and their Relationships with Sentiment Polarity

Huang, Hen-Hsen

Date Issued

2014

Date

2014

Author(s)

Huang, Hen-Hsen

URI

http://ntur.lib.ntu.edu.tw//handle/246246/261422

Abstract

A discourse relation is the rhetorical relation between two discourse units (i.e., clauses, sentences, or blocks of sentences). Well-known discourse relations include Temporal, Contingency, Comparison, and Expansion. A discourse relation indicates how its two discourse units cohere, and this information influences the meaning of the text. Discourse relations are an important clue for many applications, such as summarization, opinion mining, textual entailment, and event recognition. Research on automatic English discourse relation recognition has grown rapidly in recent years, owing to the release of corpora such as the Rhetorical Structure Theory Discourse Treebank (RST-DT) and the Penn Discourse Treebank (PDTB). Chinese discourse relation recognition is more challenging than its English counterpart because of the lack of resources and issues specific to Chinese. In this dissertation, we present an in-depth study of Chinese discourse relation analysis. We propose a statistical algorithm to recognize discourse relations at both the inter-sentential and intra-sentential levels. We also report preliminary results on Chinese discourse parsing at the sentence level. In Chinese, many long sentences contain more than two clauses and form complex discourse structures. Discourse parsing recovers the hierarchical structure of, and the relations among, the clauses in a given sentence. Discourse markers are key clues for discourse processing, but Chinese discourse markers are inherently ambiguous. To interpret them, we propose a semi-supervised framework that estimates the distribution of each Chinese discourse marker from a large corpus, ClueWeb09. Using the estimated distributions, this framework improves the performance of Chinese discourse relation recognition. Discourse relations and sentiment polarities interact in text, and we investigate their correlation using ClueWeb09. A moderate-sized set of human-annotated data is analyzed and compared with a much larger set of data labeled heuristically by machine. The results validate the association between sentiment and discourse. This dissertation focuses on four-way discourse relation classification. In future work, we will investigate finer-grained classification of discourse relations and tackle Chinese discourse parsing at the paragraph and document levels.

Subjects

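Natural language processing (自然語言處理)

Chinese discourse analysis (中文語篇分析)

Discourse relation recognition (語篇關係辨識)

Discourse markers (語篇標記)

Sentiment polarity (意見極性)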

Type

thesis

File(s)

Name

ntu-103-D97922036-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):ce0f8cf00700a96e63ee0eed9ee5c5af
