Automatic Identification for Topics of Electronic Documents
Resource
中國圖書館學會會報,59,43-58
Journal
中國圖書館學會會報
Journal Issue
59
Pages
43-58
Date Issued
1997
Date
1997
Author(s)
Abstract
The volume of electronic documents on the Internet is growing very quickly, so how to assign topics to documents effectively has become an important issue. At present, research along this line focuses on the behavior of nouns in documents. Although topics are composed of nouns, the constituents that determine which nouns are topics are not only nouns. We assume that texts are well organized and event-driven; therefore, nouns and verbs together contribute to the process of topic identification. This paper considers four factors, 1) word importance, 2) word frequency, 3) word co-occurrence, and 4) word distance, and constructs a mathematical model from them. Preliminary experiments show that the performance of the proposed model is equivalent to that of human beings.
Subjects
Information Retrieval
Electronic Document
Topic Identification
Publisher
臺北市:國立臺灣大學圖書資訊學系
Type
journal article
File(s)
Name
blac1997.pdf
Size
828.35 KB
Format
Adobe PDF
Checksum
(MD5):6792643b5c2c91ed119889dd9245c8ea
