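Research on Intelligent Knowledge Extraction Techniques and Applications, Subproject 1: Corpus Design and Construction -- A Study on the Design of Corpora and Information Retrieval Benchmark Test Collections (III)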

Kuang-hua Chen (陳光華)

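Title:	Research on Intelligent Knowledge Extraction Techniques and Applications, Subproject 1: Corpus Design and Construction -- A Study on the Design of Corpora and Information Retrieval Benchmark Test Collections (III)
Author:	Kuang-hua Chen (陳光華)
Keywords:	Benchmark;Information Retrieval;Relevance Judgment;Topic
Date issued:	31-Jul-1999
Publisher:	Taipei: Department and Graduate Institute of Library and Information Science, National Taiwan University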
Abstract:	Information retrieval (IR) research has made considerable progress and is drawing growing attention domestically, yet the Chinese research community still lacks a suitable mechanism for system evaluation. This project therefore plans and builds a benchmark test collection for Chinese information retrieval. A test collection consists of a set of documents, a set of topics, and relevance judgments linking the two, so the construction work falls into three parts: collecting and organizing documents, creating topics, and making relevance judgments. The document set is drawn from five electronic newspapers published on news websites and contains 132,207 documents. The topics are based on real user information needs gathered through an online questionnaire; after three rounds of screening and subsequent revision, 50 topics were completed. For relevance judgment, a pool of candidate relevant documents is first built for each topic, and every document in the pool is then judged manually, with three assessors judging each topic. Finally, the judgments are combined to compute and define a degree of relevance for each document in the pool. Analysis of the results shows that the test collection has a complete structure and a moderate scale, and future research can expand and improve it on this basis.
URI:	http://ntur.lib.ntu.edu.tw//handle/246246/20395
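Other identifiers:	882213E002035
Rights:	Department and Graduate Institute of Library and Information Science, National Taiwan University
Appears in collections:	Department of Library and Information Science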
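
The abstract says that three assessors judge each pooled document and that the judgments are then combined into a degree of relevance, but this record does not give the formula. The following is a minimal sketch of one way such a combination could work, assuming a hypothetical 0-3 grading scale and a simple mean across assessors; the function name, topic and document identifiers, and scores are all illustrative and not taken from the project.

```python
from statistics import mean

def combine_judgments(judgments, assessors_per_topic=3):
    """Combine per-assessor scores into one relevance degree per document.

    `judgments` maps (topic_id, doc_id) to a list of numeric scores, one
    per assessor. Averaging on a 0-3 scale is an ASSUMPTION made for
    illustration; the record does not state the project's actual measure.
    """
    combined = {}
    for key, scores in judgments.items():
        if len(scores) != assessors_per_topic:
            raise ValueError(
                f"expected {assessors_per_topic} judgments for {key}, "
                f"got {len(scores)}"
            )
        combined[key] = mean(scores)
    return combined

# Hypothetical candidate pool for a single topic (made-up scores).
pool = {
    ("T01", "doc_a"): [3, 3, 2],
    ("T01", "doc_b"): [0, 1, 0],
    ("T01", "doc_c"): [2, 2, 2],
}

relevance = combine_judgments(pool)

# Order the pooled documents from most to least relevant.
ranked = sorted(relevance, key=relevance.get, reverse=True)
```

Averaging keeps graded information that a majority vote would discard, which fits the abstract's mention of a relevance degree rather than a binary label; other combination rules would be equally consistent with the record.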

Files in this item:

File	Description	Size	Format
882213E002035.pdf		93.95 kB	Adobe PDF
