Forward-Looking Research Topics in Chinese Spoken Language Processing Technologies (2/3) [中文口語處理技術之前瞻性研究課題 (2/3)]

Lin-shan Lee (李琳山)

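Title:	Forward-Looking Research Topics in Chinese Spoken Language Processing Technologies (2/3) [中文口語處理技術之前瞻性研究課題 (2/3)]
Author:	Lin-shan Lee (李琳山)
Issue date:	31-Jul-2004
Publisher:	Taipei: Graduate Institute of Communication Engineering, National Taiwan University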
Abstract:	Chinese is not only spoken by the largest population in the world; it also differs markedly from many Western languages because of its very special structure. It is not alphabetic: a large number of Chinese characters are ideographic symbols, each pronounced as a monosyllable. The open vocabulary, the flexible wording structure, and the tonal behavior are further examples of this special structure. It is believed that Chinese spoken language processing technologies will achieve better results and performance if this special structure is taken into account. In this paper, a set of “feature units” for Chinese spoken language processing is identified, and the retrieval, segmentation, and summarization of Chinese spoken documents are taken as examples in analyzing the use of such “feature units”. Experimental results indicate that with careful consideration of the special structure and a proper choice of “feature units”, significantly better performance can be achieved.
URI:	http://ntur.lib.ntu.edu.tw//handle/246246/20283
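Other identifiers:	922213E002035
Rights:	Graduate Institute of Communication Engineering, National Taiwan University
Appears in collections:	Graduate Institute of Communication Engineering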

Files in this item:

File	Description	Size	Format
922213E002035.pdf		119.37 kB	Adobe PDF
