Contributor: 鄭卜壬 (Cheng, Pu-Jen)
Affiliation: National Taiwan University: Graduate Institute of Computer Science and Information Engineering
Author: 林佑霖 (Lin, You-Lin)
Record dates: 2010-05-17, 2018-07-05
Year: 2009
Identifier: U0001-2807200917301100
URI: http://ntur.lib.ntu.edu.tw//handle/246246/183391

Abstract (translated from Chinese):
Metasearch studies how to treat the results of multiple search systems as existing information and to merge that information into a single, better result. In this thesis, we use machine learning techniques to propose a new two-stage ranking algorithm for the problem of merging multiple search results. The two-stage ranking method is based on the concept of classification: every document is first classified, and the classification results are then used for ranking to produce the final answer. In the first stage, we classify all documents into four levels of relevance. Once every document has this information, the second stage uses a linear combination to rank the documents further and obtain the final result. For the experiments, we implemented the method on the NTCIR4 English-English standard test collection. In our experiments, the two-stage ranking method significantly outperforms several baseline methods, which shows that our algorithm is effective.

Abstract (English):
Metasearch is the problem of combining the results of multiple independent search algorithms into a single result list in order to improve retrieval effectiveness. We propose a novel 2-stage ranking method that addresses this problem by applying machine learning. The 2-stage ranking method uses the concept of classification to solve the metasearch problem. In the first stage, we label each document in the search results as relevant or irrelevant by classification, and we discuss the differences between general classification and cost-sensitive classification in our algorithm. Once all of the documents in the search results have been labeled, stage 2 uses this information to produce the final ranking by linear combination. The 2-stage ranking method performs well on NTCIR4 English-English IR data. The experimental results show that our method outperforms existing metasearch algorithms and gives a significant improvement.

Table of Contents:
Abstract (in Chinese)
Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
  1.1 Motivations
  1.2 Problem Specification
  1.3 Basic Idea
  1.4 Proposed Approaches
  1.5 Thesis Organization
Chapter 2: Related Work
Chapter 3: Methodology
  3.1 Framework
  3.2 Settings
  3.3 Stage 1: Classification
    3.3.1 Feature Extraction
    3.3.2 Classifier
    3.3.3 Cost-Sensitive Classification
  3.4 Stage 2: Ranking with Classes
Chapter 4: Experiment
  4.1 Data Set
  4.2 Environment
  4.3 Measurement
  4.4 Baseline Models
  4.5 Experiment Results and Discussion
    4.5.1 Exp 1: Methods Evaluation
    4.5.2 Exp 2: Effect of Input Result's Size
    4.5.3 Exp 3: Feature Analysis
Chapter 5: Conclusion
  5.1 Summary of Contributions
  5.2 Future Work
Bibliography

Format: application/pdf
Size: 639043 bytes
Language: en-US
Keywords: metasearch (集合型搜尋); learning to rank; search result merging
Title: A 2-Stage Ranking Method to Merge Multiple Search Results (一種合併多個搜尋結果之兩階段排序方法)
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/183391/1/ntu-98-R96922124-1.pdf
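
Illustrative sketch (not taken from the thesis): the two-stage idea described in the abstracts, classify each document into a relevance level and then rank with a linear combination, might look roughly like the Python below. Everything specific here is an assumption made for illustration: the function names classify and two_stage_rank, the threshold rule standing in for the trained classifier, the weights, and the toy scores. The thesis's actual features, classifier, and combination formula are not given in this record.

```python
def classify(scores, thresholds=(0.25, 0.5, 0.75)):
    """Stage 1 stand-in (hypothetical): map a document's mean score to a level 0..3.

    The thesis trains a classifier; a fixed threshold rule is used here only
    so that the sketch is self-contained.
    """
    mean = sum(scores) / len(scores)
    return sum(mean >= t for t in thresholds)


def two_stage_rank(results, weights, class_weight=1.0):
    """Rank documents by relevance level plus a weighted sum of system scores.

    results: dict mapping doc_id -> list of normalized scores, one per input
             search system (assumed layout).
    weights: one weight per input system (assumed, not learned here).
    """
    ranked = []
    for doc_id, scores in results.items():
        level = classify(scores)  # stage 1: relevance level
        combined = sum(w * s for w, s in zip(weights, scores))  # stage 2: linear combination
        ranked.append((class_weight * level + combined, doc_id))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in ranked]


# Toy input: three documents scored by three hypothetical search systems.
results = {
    "d1": [0.9, 0.8, 0.7],
    "d2": [0.2, 0.1, 0.0],
    "d3": [0.7, 0.6, 0.5],
}
ranking = two_stage_rank(results, weights=[0.5, 0.3, 0.2])
print(ranking)
```

With class_weight larger than the maximum possible combined score, the relevance level dominates the ordering and the linear combination only breaks ties within a level; that is one possible reading of "ranking with classes", not a confirmed detail of the thesis.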