The Impact of Feature Normalization on Different Feature Types of Medical Datasets

Data pre-processing is usually performed in the knowledge discovery in databases (KDD) process to obtain quality data mining results. Feature normalization, or scaling, is one important pre-processing step: many datasets contain features with broad ranges of values, and feature normalization rescales each feature value to a fixed range, usually between 0 and 1. Medical domain datasets usually fall into three kinds: categorical, numerical, and mixed data types. This paper examines the effect of feature normalization on these three types of medical datasets. Our experimental results indicate that, for categorical and some mixed-type datasets, feature normalization does not necessarily make the k-NN and SVM classifiers perform better than the same classifiers without feature normalization. For numerical datasets, on the other hand, k-NN and SVM with feature normalization perform better than the baseline classifiers.
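As an illustration of rescaling each feature to the range 0 to 1, the following is a minimal sketch of min-max scaling in plain Python. The abstract does not state which normalization formula the paper uses, so min-max scaling is an assumption here. The function name `min_max_normalize` and the sample records are hypothetical and are not taken from the paper.

```python
def min_max_normalize(rows):
    """Rescale each column of `rows` to [0, 1] with min-max scaling.

    Assumption: every feature is numeric. Categorical features would
    need to be encoded before this step.
    """
    if not rows:
        return []
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled_row = []
        for value, lo, hi in zip(row, mins, maxs):
            # A constant feature has no spread; map it to 0.0
            # instead of dividing by zero.
            scaled_row.append(0.0 if hi == lo else (value - lo) / (hi - lo))
        scaled.append(scaled_row)
    return scaled


# Hypothetical records: (age in years, cholesterol in mg/dL).
records = [[25, 180.0], [50, 240.0], [75, 300.0]]
normalized = min_max_normalize(records)
print(normalized)
```

After scaling, both features span the same range, so neither dominates a distance computation such as the one k-NN relies on merely because it is measured in larger units.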

Subjects

data preprocessing | feature normalization | medical datasets | pattern classification

Type

conference paper
