Support Vector Machines: Classification with Coding and Regression for Gene Selection

Chen, Pei-Chun

Date Issued

2008

Author(s)

Chen, Pei-Chun

URI

http://ntur.lib.ntu.edu.tw//handle/246246/180663

Abstract

This thesis addresses two major themes: multiclass support vector machines and support vector regression for gene selection. In the first part, we propose a regression approach to multiclass support vector classification. We introduce several existing coding schemes into support vector classification by coding the class labels as multivariate responses. Regression of these multivariate responses on kernelized input data is used to extract a low-dimensional feature subspace for discriminant purposes. We unify these coding schemes by showing that they are equivalent in the sense that they lead to the same low-dimensional discriminant feature subspace. Classification is then carried out in this low-dimensional subspace using a linear discriminant algorithm, which can be any reasonable choice. Combining the regression approach for extracting a low-dimensional discriminant subspace with a user-specified linear algorithm yields a simple yet powerful toolkit for multiclass support vector classification. Issues of encoding, decoding, and the notion of equivalence of codes are discussed. Experimental results, including prediction ability and CPU time, show that our approach is a competitive alternative for the multiclass support vector machine problem.

In the second part, we propose a support vector regression approach for gene selection and use the selected genes for disease classification. Current gene selection methods based on microarray data treat each individual subject with equal weight with respect to the disease of interest. However, tissues collected from different patients can come from different disease stages and may differ in their strength of association with the disease. To reflect this circumstance, our proposed method takes subject variation into account by assigning different weights to subjects. The weights are calculated via support vector regression. Significant genes are then selected based on the cumulative sum of weighted expressions. The proposed gene selection procedure is illustrated and evaluated using the acute leukemia and colon cancer data. The results and performance are compared with four other approaches in terms of classification accuracy.
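
As a rough illustration of the first part of the abstract (code the class labels, regress the codes on kernelized inputs, then classify linearly in the resulting low-dimensional subspace), here is a minimal Python sketch. It is not the thesis's implementation. Several choices are assumptions made for illustration only: indicator (one-hot) coding as the coding scheme, a Gaussian kernel, a ridge-penalized least-squares fit swapped in for the support vector regression step, and a nearest-centroid rule as the linear discriminant algorithm. The names `gaussian_kernel`, `fit`, `predict`, `gamma`, and `lam` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances, then exp(-gamma * d^2).
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def fit(X, y, gamma=0.5, lam=1e-2):
    classes = np.unique(y)
    # Assumed coding scheme: one indicator column per class.
    Y = (y[:, None] == classes[None, :]).astype(float)
    # Kernelized inputs: each row holds kernel values against all training points.
    K = gaussian_kernel(X, X, gamma)
    k_mean = K.mean(axis=0)
    Kc = K - k_mean
    Yc = Y - Y.mean(axis=0)
    # Ridge regression of the coded responses on the kernelized inputs
    # (a stand-in for the regression step described in the abstract).
    B = np.linalg.solve(Kc.T @ Kc + lam * np.eye(K.shape[1]), Kc.T @ Yc)
    # Fitted responses span a subspace of dimension at most (number of classes - 1),
    # because the centered indicator columns sum to zero.
    Z = Kc @ B
    centroids = np.vstack([Z[y == c].mean(axis=0) for c in classes])
    return {"X": X, "gamma": gamma, "k_mean": k_mean, "B": B,
            "classes": classes, "centroids": centroids}

def predict(model, X_new):
    # Project new points into the discriminant subspace.
    K = gaussian_kernel(X_new, model["X"], model["gamma"]) - model["k_mean"]
    Z = K @ model["B"]
    # Assumed linear rule: assign each point to the nearest class centroid.
    d2 = ((Z[:, None, :] - model["centroids"][None, :, :]) ** 2).sum(axis=2)
    return model["classes"][d2.argmin(axis=1)]
```

Calling `fit(X, y)` on an `(n, p)` array of inputs and a length-`n` label array returns a model dictionary, and `predict(model, X_new)` returns predicted labels. Any other linear classifier could replace the nearest-centroid step, which is the flexibility the abstract points to.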

Subjects

coding

gene selection

kernel

linear discriminant subspace

machine learning

microarray data analysis

support vector machine

support vector regression

SDGs

SDG3

Type

thesis

File(s)

Name

ntu-97-D93842005-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):80957e6372b3373fd4eaa7746c39ff16
