Improved Algorithms for the Selection of Tag SNPs
Date Issued
2005
Date
2005
Author(s)
Chang, Chia-Jung
DOI
en-US
Abstract
Recent studies have shown that the patterns of Linkage Disequilibrium (LD) observed
in the human population reveal a block-like structure. The entire chromosome can be
partitioned into high LD regions interspersed by low LD regions. The high LD regions
are usually called “haplotype blocks”. Within a haplotype block, there are only few
haplotype patterns and only a small subset of SNPs (called tag SNPs) are sufficient to
distinguish these patterns.
To solve the problem of finding tag SNPs, we propose a hybrid method that combines
the ideas of the branch-and-bound method and the greedy algorithm. This method
explores larger solution space to obtain a better solution than a traditional greedy algorithm.
It also allows the user to adjust the efficiency of the program and quality of
solutions. This algorithm has been implemented and tested on a variety of simulated
and biological data. The experimental results indicate that our program can find better
solutions than previous methods. This approach is quite general since it can be used to
adapt other greedy algorithms to solving their corresponding problems.
In addition, we can reduce the number of tag SNPs even more by considering the
extent of linkage disequilibrium in the human genome. We show that the extent of LD
can be also used to boost the heavy computation of computation of pairwise LD by giving
a faster algorithm. We propose two methods of which the first is a posterior approach
based one existing algorithms and the second identifies tag SNPs by considering the
correlation between SNPs from the beginning. The experimental results show that our
methods can reduce the number of tags SNPs in comparison with previous methods and the efficiency is significantly improved.
Subjects
單核苷
酸多型性
連鎖不平衡
單體型
Single Nucleotide Polymorphism
Linkage Disequilibrium
Haplotype
Type
thesis
