Organization and Extraction for Information (資訊的組織與擷取)

Author: 陳光華
Year: 1997
Type: journal article
Affiliation: Department of Library and Information Science, National Taiwan University (國立臺灣大學圖書資訊學系)
Language: zh-TW
Keywords: Information Retrieval; Information Extraction; Metadata
DOI: 10.6182/jls.1997.12.127
Record: http://ntur.lib.ntu.edu.tw//handle/246246/29214
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/29214/1/jlis199702.pdf (application/pdf, 320578 bytes)
Record dates: 2006-08-23, 2018-05-30

Abstract: The development of the Internet has made information retrieval research more challenging. Information retrieval systems, however, usually only tell users which documents are relevant rather than supplying the information users actually need; so-called "information retrieval" is in practice "text retrieval," and users must still find the needed information within the retrieved texts. Information extraction is a higher-level task that analyzes documents further and extracts specific information according to pre-defined templates. Just as libraries describe their collections in machine-readable cataloging (MARC) formats, the templates of an information extraction system can be regarded, like MARC formats, as a metadata format, that is, data that describes data. This paper explains the relationship between metadata and information extraction and discusses how the linguistic analysis techniques of natural language processing can help users extract the information they need.
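The idea of extracting information according to a pre-defined template can be illustrated with a minimal Python sketch. It is not the method described in the paper: the `PublicationTemplate` class, its three slots, and the regular expressions in `SLOT_PATTERNS` are all hypothetical, chosen only to show a template acting as a small metadata format whose slots are filled from free text.

```python
import re
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PublicationTemplate:
    """Hypothetical pre-defined template: each slot is one metadata field."""
    title: Optional[str] = None
    year: Optional[str] = None
    doi: Optional[str] = None


# Illustrative slot patterns (assumptions, not taken from the paper).
SLOT_PATTERNS = {
    "title": re.compile(r"Title:\s*(.+)"),
    "year": re.compile(r"\b(1[89]\d{2}|20\d{2})\b"),
    "doi": re.compile(r"\b(10\.\d{4,9}/\S+)"),
}


def extract(text: str) -> PublicationTemplate:
    """Fill each template slot with the first match found in the text."""
    record = PublicationTemplate()
    for slot in fields(record):
        match = SLOT_PATTERNS[slot.name].search(text)
        if match:
            setattr(record, slot.name, match.group(1).strip())
    return record


if __name__ == "__main__":
    sample = (
        "Title: Organization and Extraction for Information\n"
        "Published 1997. DOI 10.6182/jls.1997.12.127"
    )
    print(extract(sample))
```

Slots with no match stay `None`, so the filled template reports what was found and what was not. A real system would presumably replace these surface patterns with the linguistic analysis the abstract refers to.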