Unsupervised Feature Selection: Minimize Information Redundancy of Features
Date Issued
2009
Date
2009
Author(s)
Yen, Chun-Chao
Abstract
In this thesis, we propose an unsupervised feature selection method that removes redundant features from a dataset. The major contributions are twofold. First, we propose an eigen-decomposition method to rank the hyperplanes (which describe the relations between features) by how close they are to linear dependency, and then design an efficient Gaussian-elimination method that removes, one at a time, the feature best represented by the remaining features. Second, we prove that our method is similar to removing the features that contribute the most to the PCA components with the smallest eigenvalues, while also accounting for the effect of each feature removal. We perform experiments on an artificial dataset that we created and on two real-world datasets with different characteristics. The experiments show that our method removes the dependent features almost perfectly, without losing any independent dimension, in the artificial dataset, and that it outperforms two competing algorithms on the real-world datasets.
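The PCA interpretation mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it omits the hyperplane ranking and the Gaussian-elimination step, and simply recomputes the covariance eigen-decomposition after every removal, dropping the feature with the largest absolute loading on the smallest-eigenvalue component. The function name `remove_redundant_features` and the parameter `n_remove` are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np

def remove_redundant_features(X, n_remove):
    """Greedy sketch (assumed simplification, not the thesis's method).

    Repeatedly drops the feature that contributes most to the principal
    component with the smallest eigenvalue, recomputing the decomposition
    after each removal so that earlier removals are taken into account.
    Returns (remaining column indices, removed column indices).
    """
    remaining = list(range(X.shape[1]))
    removed = []
    for _ in range(n_remove):
        sub = X[:, remaining]
        centered = sub - sub.mean(axis=0)
        cov = centered.T @ centered / (centered.shape[0] - 1)
        # eigh returns eigenvalues in ascending order for symmetric matrices,
        # so column 0 is the direction of least variance (the hyperplane
        # closest to an exact linear dependency among the features).
        _, eigvecs = np.linalg.eigh(cov)
        weakest = eigvecs[:, 0]
        # The feature with the largest absolute loading is dropped.
        j = int(np.argmax(np.abs(weakest)))
        removed.append(remaining.pop(j))
    return remaining, removed

# Toy data: three independent features plus one exact linear combination.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
dependent = 0.5 * base[:, 0] + 0.5 * base[:, 1]
X_demo = np.column_stack([base, dependent])
kept, dropped = remove_redundant_features(X_demo, n_remove=1)
```

In this toy case the dependency is 0.5·x0 + 0.5·x1 − x3 = 0, so the dependent column (index 3) carries the largest loading and is the one removed. Recomputing the decomposition inside the loop is what lets each step reflect the features already removed.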
Subjects
Unsupervised Feature Selection
Machine Learning
Linear Dependency
Principal Component Analysis
Type
thesis
File(s)
Name
ntu-98-R96944016-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):8eb3bf43e2bebafcc2b7c85292b29de3
