簡立峯
National Taiwan University, Graduate Institute of Information Management
蕭忠立 (Hsiao, Chung-Li)
2007-11-26; 2018-06-29
2006
http://ntur.lib.ntu.edu.tw//handle/246246/54266

Terminological resources on the Web are numerous and grow every day. Term-based approaches are the mainstream of information retrieval, and organizing terms into a well-formed information structure such as a term graph helps advanced IR applications such as question answering and automatic summarization. However, constructing a useful term graph from these growing resources faces two problems. First, terminological resources provide no context information for a term, unlike the documents that conventional relation extraction analyzes. Second, new terms keep appearing with many possible relation types, so the types cannot all be explicitly defined in advance.

To solve these problems, we propose an example-based Web mining approach to discover term relations from a term set. The user first gives examples of Related Term Pairs (RTPs) for a relation of interest. We then identify relations by organizing RTPs according to the similarity of their relations to the user-given examples, estimating similarity from the context words that occur in the search results returned when an RTP is queried. Finally, features shared by the RTPs grouped under the same relation serve as the relation label of that group. We test the approach on a simulated term set. The experiments examine accuracy across several relation types, as well as the influence of example selection and of the number of examples.

ACKNOWLEDGEMENTS
ABSTRACT (CHINESE)
THESIS ABSTRACT
LIST OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1. INTRODUCTION
  1.1 BACKGROUND
    1.1.1 Terminological Resources
    1.1.2 Term and Relation Organizing
  1.2 MOTIVATION
  1.3 GOAL
  1.4 THESIS ORGANIZATION
CHAPTER 2. RELATED WORK
  2.1 RELATION EXTRACTION
  2.2 WEB IE
  2.3 RELATION DISCOVERY
  2.4 COMPARISON
CHAPTER 3. PROBLEM AND IDEA
  3.1 PROBLEM
  3.2 IDEA
CHAPTER 4. APPROACH
  4.1 USING THE WEB AS FEATURE SOURCE
  4.2 RTP MAPPING
    4.2.1 VSM method
    4.2.2 Co-occur method
  4.3 LABELING RELATIONS
CHAPTER 5. EXPERIMENTS AND DISCUSSIONS
  5.1 FINDING RELATED TERMS
    5.1.1 Data
    5.1.2 Evaluation
    5.1.3 Result
  5.2 INFLUENCE OF EXAMPLES
    5.2.1 Data
    5.2.2 Evaluation
    5.2.3 Result
  5.3 AMOUNT OF EXAMPLES
    5.3.1 Data
    5.3.2 Evaluation
    5.3.3 Result
  5.4 SYSTEM PERFORMANCE
    5.4.1 Data
    5.4.2 Evaluation
    5.4.3 Result
  5.5 LABELING RELATION
CHAPTER 6. CONCLUSION AND FUTURE WORK
  6.1 CONCLUSION
  6.2 FUTURE WORK
REFERENCE
APPENDIX I. TESTING DATA
APPENDIX II. RTP MAPPING SYSTEM SNAPSHOT
CURRICULUM VITAE

3108001 bytes
application/pdf
en-US
term relation; term graph; term organizing; Web mining
透過基於實例之網路探勘方法進行術語關係發掘
Discovering Term Relations Through An Example-based Web Mining Approach
other
http://ntur.lib.ntu.edu.tw/bitstream/246246/54266/1/ntu-95-R93725042-1.pdf
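
As a rough illustration of the similarity step described in the abstract, the sketch below ranks candidate term pairs by how closely their context words match those of a user-given example pair. It is not the thesis's implementation. The snippets are made-up placeholders standing in for search-engine results, the function names are hypothetical, and raw word counts with cosine similarity are an assumed weighting that this record does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for search results of each term pair.
# These are invented for illustration and are not data from the thesis.
SNIPPETS = {
    ("Paris", "France"): ["Paris is the capital of France",
                          "France capital city Paris"],
    ("Tokyo", "Japan"): ["Tokyo is the capital of Japan"],
    ("Toyota", "Japan"): ["Toyota is a car maker based in Japan"],
}

def context_vector(snippets, pair):
    """Count the words in the snippets, leaving out the two terms themselves."""
    terms = {t.lower() for t in pair}
    words = []
    for snippet in snippets:
        tokens = re.findall(r"[a-z]+", snippet.lower())
        words.extend(w for w in tokens if w not in terms)
    return Counter(words)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_by_example(example, candidates, snippets=SNIPPETS):
    """Sort candidate pairs by context similarity to the example pair."""
    example_vec = context_vector(snippets[example], example)
    scored = [(cosine(example_vec, context_vector(snippets[c], c)), c)
              for c in candidates]
    return sorted(scored, reverse=True)

ranking = rank_by_example(("Paris", "France"),
                          [("Tokyo", "Japan"), ("Toyota", "Japan")])
for score, pair in ranking:
    print(pair, round(score, 3))
```

With these placeholder snippets, the pair that shares the example's "capital of" context ranks above the pair that does not. A real system would fetch snippets from a search engine and would likely need stop-word handling and term weighting.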