陳光華
National Taiwan University: Graduate Institute of Library and Information Science
林孚嘉 (Lin, Fu-chia; Fu-chia Lin)
2010-05-05
2018-05-30
2008
U0001-1802200804383100
http://ntur.lib.ntu.edu.tw//handle/246246/179655

Abstract (Chinese)
This study uses the standard document set and topic set provided by the fifth NTCIR workshop as its test environment, and selects the topics whose descriptions are written entirely in Chinese for the retrieval experiments. Participants were given vocabulary resources produced by three different concepts and techniques: a traditional thesaurus, a statistical thesaurus, and an ontology. After related-term feedback processing, participants selected query expansion terms from these resources. Measured by the number of topics whose retrieval effectiveness improved, the statistical thesaurus improved slightly more topics than the ontology; measured by the size of the improvement, however, the ontology performed best. The traditional thesaurus performed worst at improving retrieval effectiveness in this experiment.

Abstract (English)
In this study, the effectiveness of different kinds of vocabulary resources for Chinese information retrieval is examined and compared, based on interactions between users and the information retrieval system. We use a traditional thesaurus, a statistical thesaurus, and an ontology to carry out a series of experiments for detailed investigation. The NTCIR5 test collection is used as the benchmark; it is composed of a topic set, a document set, and an answer set. To make the study more targeted, the 25 queries written in Chinese only are extracted from the 50 queries in the NTCIR5 topic set and examined. The experimental results show that the statistical thesaurus greatly increases the number of improved queries, whereas the ontology greatly increases retrieval performance. The traditional thesaurus shows the poorest performance among these vocabulary resources. We also find that users with good experience in information retrieval make good use of vocabulary resources and produce good retrieval results.
In addition, all vocabulary resources help Type-II queries, i.e., queries with simple concepts and no specific temporal or spatial scope.

Table of contents
Abstract
Contents
List of figures or tables
Chapter 1 Introduction
  1.1 Problem statement
  1.2 Research objectives
  1.3 Scope and limitations
  1.4 Research methods and procedure
  1.5 Definitions of terms
Chapter 2 Literature review
  2.1 Selected studies in information retrieval
    2.1.1 Information retrieval systems
    2.1.2 Information retrieval models
    2.1.3 Information retrieval evaluation
  2.2 Overview of vocabulary resources
    2.2.1 Traditional thesauri
    2.2.2 Statistical thesauri
    2.2.3 Ontologies
  2.3 Query term expansion
    2.3.1 Importance of search terms
    2.3.2 Query expansion
Chapter 3 Putting vocabulary resource effectiveness into practice
  3.1 Retrieval experiment environment
    3.1.1 Test documents and topic set
    3.1.2 Vocabulary resources assisting users
    3.1.3 Retrieval core
    3.1.4 Lookup program for the vocabulary resources
  3.2 Design and selection of query topics
  3.3 Retrieval experiment procedure
    3.3.1 Recruiting participants
    3.3.2 Conducting the retrieval experiments
    3.3.3 Analysis of experimental results
  3.4 Evaluation of experimental results
Chapter 4 Experimental results and analysis
  4.1 Statistics of search terms entered by the three participants
    4.1.1 Search terms entered by Participant A, by type
    4.1.2 Search terms entered by Participant B, by type
    4.1.3 Search terms entered by Participant C, by type
  4.2 Retrieval results of the three participants
    4.2.1 Effects of vocabulary resources on retrieval
    4.2.2 Retrieval effectiveness improved after participants used vocabulary resources
    4.2.3 Effectiveness of some queries equal to the original query after participants used vocabulary resources
    4.2.4 Results identical to the original query after participants used vocabulary resources
    4.2.5 Participant B: no results returned for the original query or after using vocabulary resources
    4.2.6 Participant C: results with vocabulary resources worse than the original query
  4.3 Overall analysis of vocabulary resource effectiveness
    4.3.1 Proportion of topics for which related terms were returned
    4.3.2 Retrieval effectiveness of related terms selected from the vocabulary resources
    4.3.3 Retrieval performance of the different vocabulary resources
Chapter 5 Conclusions and recommendations
  5.1 Conclusions
    5.1.1 Effect of searcher characteristics on retrieval performance
    5.1.2 Effect of vocabulary resources on retrieval performance
  5.2 Recommendations
    5.2.1 Conduct further retrieval experiments in similar settings
    5.2.2 Develop vocabulary resources that better match human language habits
    5.2.3 Develop better ways of presenting related terms
References
Appendices
  Appendix Table 1 Retrieval topics and participants' search terms in this experiment
  Appendix Table 2 Retrieval performance of Participant A
  Appendix Table 3 Retrieval performance of Participant B
  Appendix Table 4 Retrieval performance of Participant C

application/pdf
846995 bytes
en-US
Information retrieval; query term expansion; vocabulary resources; ontology
Chinese Information Retrieval; Query Expansion; Vocabulary Resources; Ontology
中文資訊檢索之詞彙資源效益
Effectiveness of Vocabulary Resources in Chinese Information Retrieval
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/179655/1/ntu-97-R90126012-1.pdf