Clustering Analysis by Attributes Interrelations and its Application to Clustering of Differentially Expressed Genes

Lin, Chen-Sui

Date Issued

2005

Author(s)

Lin, Chen-Sui

DOI

en-US

URI

http://ntur.lib.ntu.edu.tw//handle/246246/51163

Abstract

The unsupervised classification methods clustering analysis and factor analysis aim to find meaningful structure in observed attributes. This structure is usually expressed by grouping attributes according to their similarities or relationships. Factor analysis, however, suffers from rank deficiency in numerical computation when attributes outnumber samples. In microarray data analysis, for example, expression levels of 10,000 to 20,000 genes are collected for each array, and the number of genes is usually far larger than the number of microarrays. Clustering analysis, on the other hand, can handle a vast number of attributes with few samples. It has drawbacks of its own, including misapplication of the correlation coefficient, difficulty in evaluating cluster quality, and difficulty in determining the number of clusters. In this research, we first discuss how to characterize interrelationships among attributes, and then develop clustering methods suited to grouping interrelated attributes. The “R2 with PCA” method places more emphasis on the linear relationships between two clusters, while the “Variance explanation” method considers not only the interrelations among attributes but also the variation of the attributes. We also propose statistics for evaluating cluster quality that take into account the interrelationships among clusters and the variance explained by the clusters. Finally, we apply these methods to two cases: 19 blood tests on 24 human subjects, and Down syndrome microarray data.
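
As a rough illustration of grouping attributes by their interrelations, the sketch below clusters attributes with average-linkage agglomerative clustering on the dissimilarity 1 - r^2, where r is the Pearson correlation. This is a generic textbook approach and is not the thesis's “R2 with PCA” or “Variance explanation” method, neither of which is specified in the abstract. The function names `pearson_r` and `cluster_attributes` and the toy data are hypothetical and chosen only for this example.

```python
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cluster_attributes(columns, n_clusters):
    """Average-linkage agglomerative clustering of attributes (hypothetical helper).

    The dissimilarity between two attributes is 1 - r^2, so strongly
    positively and strongly negatively correlated attributes both count
    as close.
    """
    names = list(columns)
    dist = {
        frozenset((a, b)): 1.0 - pearson_r(columns[a], columns[b]) ** 2
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise dissimilarity between the members of two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: 4 attributes measured on 6 samples (values are made up).
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [-2.1, -3.9, -6.2, -8.0, -9.8, -12.1],  # roughly -2 * a
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    "d": [6.2, 1.9, 8.1, 2.2, 9.9, 18.3],  # roughly 2 * c
}
clusters = cluster_attributes(data, n_clusters=2)
print(clusters)
```

Using r^2 rather than r places the negatively correlated pair `a` and `b` in the same group, which a dissimilarity of 1 - r would not do; this is one common way to avoid treating strong negative correlation as dissimilarity.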

Subjects

群集分析

相關性之相異度

分群結果品質

群組個數決定

Clustering analysis

Dissimilarity using Correlation

Cluster quality

Cluster number determination

Type

thesis

File(s)

Name

ntu-94-R92546020-1.pdf

Size

23.53 KB

Format

Adobe PDF

Checksum

(MD5):f9ab59263eda7d886a6f575b9282e89b
