Automatic Document Classification Based on Temporal Analysis
Date Issued
2009
Date
2009
Author(s)
Chiang, Wen-Cheng
Abstract
The widespread use of the Internet has increased the amount of information that is stored on and accessible through the web, so retrieving large amounts of information efficiently is becoming increasingly important. Automatic Document Classification (ADC) is a common strategy that associates information with semantically meaningful classes and can improve retrieval efficiency. However, traditional ADC does not consider the temporal factor when constructing a classifier. New information may appear, and specific terms may disappear, over time. These characteristics can cause some documents to be classified differently at different times. We first discuss several temporal issues and design experiments to evaluate the influence of the temporal factor on classification. We then propose a temporal analysis strategy that searches for the optimum training set for constructing a temporal classifier. With this temporal analysis process, we reduce the amount of data needed to train the classifier and improve classification performance.
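The idea of searching for a training set by time period can be illustrated with a small sketch. This is not the method from the thesis, whose details the abstract does not give. It assumes documents are `(year, text, label)` tuples, swaps in a simple multinomial naive Bayes classifier, and tries training on only the most recent `k` years. All function names and the toy documents below are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
import math

def train_nb(docs):
    """Train a multinomial naive Bayes model on (year, text, label) tuples."""
    word_counts = defaultdict(Counter)  # label -> term frequencies
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for _, text, label in docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest Laplace-smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total_terms = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # ignore terms never seen in training
                score += math.log((word_counts[label][tok] + 1) / (total_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, docs):
    return sum(predict_nb(model, text) == label for _, text, label in docs) / len(docs)

def best_training_window(train_docs, val_docs, max_window):
    """Train on the most recent k years for k = 1..max_window and keep the best k."""
    latest = max(year for year, _, _ in train_docs)
    results = {}
    for k in range(1, max_window + 1):
        window = [d for d in train_docs if d[0] > latest - k]
        results[k] = accuracy(train_nb(window), val_docs)
    # On ties, prefer the smaller window (less training data).
    best_k = max(results, key=lambda k: (results[k], -k))
    return best_k, results

# Toy corpus (made up): the term "apple" drifts from the food class to the tech class.
train_docs = [
    (2006, "apple pie recipe", "food"),
    (2006, "apple orchard harvest", "food"),
    (2006, "apple juice recipe", "food"),
    (2006, "laptop chip benchmark", "tech"),
    (2007, "apple cider harvest", "food"),
    (2007, "phone chip release", "tech"),
    (2008, "apple phone release", "tech"),
    (2008, "apple laptop launch", "tech"),
    (2008, "bread recipe oven", "food"),
]
val_docs = [
    (2009, "apple launch", "tech"),
    (2009, "apple store", "tech"),
    (2009, "oven recipe", "food"),
]

best_k, results = best_training_window(train_docs, val_docs, max_window=3)
print(best_k, results)
```

On this toy corpus the most recent single year is selected, because the older documents still tie "apple" to the food class and pull one validation document into the wrong category. Real data would call for a proper held-out split and a stronger classifier.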
Subjects
temporal analysis
optimum training set
document classification
classifier
Type
thesis
File(s)
Name
ntu-98-R95922038-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):f14a32b1dd95d2fc86c14c11be83860e
