Two Methods in Association Analysis: (1) Likelihood Ratio Test with Clustered Haplotypes (2) Kernel Canonical Correlation Analysis
Date Issued
2008
Date
2008
Author(s)
Lee, Mei-Hsien
Abstract
Association analysis is a common task in statistical analysis. For instance, to investigate the association between diseases and genetic markers, scientists conduct studies, known as association studies, to detect the liability loci. There are two basic study designs: population-based case-control studies and family-based association studies. Researchers usually focus on a specific study design and then develop methodology for its analysis. Current statistical methods can be roughly categorized as nonparametric or parametric. Difficulties arise, however, when some haplotypes have small frequencies, when the number of degrees of freedom in the association test is large, and when the data set is enormous. In the first part of this thesis, we adopt the parametric likelihood approach, use an evolutionary clustering tool for minor haplotypes, reduce the dimensionality corresponding to the number of haplotypes, and take into account the uncertainty in the transmission phase. Simulation studies and comparisons with Famhap and FBAT show that the likelihood ratio test with clustered haplotypes outperforms both methods.
The second part of this thesis tackles the association test from the perspective of statistical learning theory, with more emphasis on the bioinformatics viewpoint. To measure the association between two sets of random variables, Hotelling (1936) proposed the classical linear canonical correlation analysis (LCCA). However, its application is limited to linear association and relies on a normality assumption. We introduce a nonparametric kernel canonical correlation analysis (KCCA) to measure nonlinear association between two sets of variables and propose a new independence test under KCCA. KCCA can be applied directly to genotype data, which avoids the inference of haplotype phase and the estimation of haplotype frequencies. Implementation issues are discussed, and numerical experiments comparing KCCA with other nonparametric methods are presented.
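As a rough illustration of the KCCA idea summarized in the abstract, the following minimal Python sketch computes a regularized first kernel canonical correlation directly from genotype codes and calibrates it by permutation. It is not the thesis's implementation: the Gaussian kernel, the bandwidth `sigma`, the regularization constant `kappa`, the simulated data, and all function names are assumptions made for this example, and the permutation test is a generic stand-in for the independence test proposed in the thesis, whose details are not given in this record.

```python
import numpy as np


def centered_gaussian_gram(Z, sigma):
    """Centered Gaussian-kernel Gram matrix for the rows of Z (n x p)."""
    Z = np.asarray(Z, dtype=float)
    sq = np.sum(Z**2, axis=1)
    # Squared Euclidean distances, clipped at 0 against rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    return H @ K @ H


def shrink(K, kappa):
    """Return (K + kappa*I)^{-1} K, the regularized factor of the statistic."""
    return np.linalg.solve(K + kappa * np.eye(K.shape[0]), K)


def first_correlation(Sx, Sy):
    """Largest singular value of Sx @ Sy (regularized first kernel correlation)."""
    return np.linalg.svd(Sx @ Sy, compute_uv=False)[0]


def kcca_permutation_test(X, Y, sigma=1.0, kappa=1.0, n_perm=199, seed=0):
    """Regularized first kernel canonical correlation with a permutation p-value.

    Assumed (not from the thesis): Gaussian kernel, a single bandwidth for
    both variable sets, and a permutation null distribution.
    """
    rng = np.random.default_rng(seed)
    Sx = shrink(centered_gaussian_gram(X, sigma), kappa)
    Sy = shrink(centered_gaussian_gram(Y, sigma), kappa)
    rho = first_correlation(Sx, Sy)
    null = np.empty(n_perm)
    for b in range(n_perm):
        # Relabel the individuals in Y; this breaks any X-Y dependence.
        perm = rng.permutation(Sy.shape[0])
        null[b] = first_correlation(Sx, Sy[np.ix_(perm, perm)])
    p_value = (1 + np.sum(null >= rho)) / (n_perm + 1)
    return rho, p_value


# Simulated example: one SNP coded 0/1/2 with a heterozygote-only effect,
# a nonlinear pattern that a linear correlation with the allele count misses.
rng = np.random.default_rng(1)
n = 80
genotype = rng.binomial(2, 0.5, size=(n, 1))
trait = 2.0 * (genotype[:, 0] == 1) + 0.3 * rng.standard_normal(n)
rho, p_value = kcca_permutation_test(genotype, trait[:, None], kappa=0.8)
print(f"first kernel canonical correlation: {rho:.3f}, p-value: {p_value:.3f}")
```

The sketch works on unphased genotype codes only, so no haplotype phase or haplotype frequency is estimated. Some regularization is needed because, without it, the kernel canonical correlation of a finite sample is trivially close to 1; the value of `kappa` used here is arbitrary and would need tuning in practice.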
Subjects
association study
bioinformatics
clustering
evolution
haplotype
haplotype ambiguity
kernel canonical correlation
likelihood function
statistical learning
Type
thesis
File(s)
Name
ntu-97-D92842008-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):2b2223cfbcd9adcad149609c3cdda872
