Identification of Differentially Expressed Genes of RNA-Seq Data based on Bayesian Approaches
Date Issued
2015
Date
2015
Author(s)
Zeng, Yu-Shiang
Abstract
The rapid development of next-generation sequencing (NGS) technology has advanced fields such as medical science, agriculture, and biotechnology. NGS makes whole-genome sequencing and de novo sequencing feasible for exploring biological theory, and RNA-seq is one of its core applications. RNA-seq is used to measure gene expression levels and to test whether a specific gene is differentially expressed. It has recently been replacing microarray technology and is gradually becoming an important benchmark for gene expression testing. However, RNA-seq read counts are discrete and often exhibit over-dispersion (the variance of the data is larger than the mean). The negative binomial model is commonly applied to handle over-dispersion, but estimating its parameters is a further issue. Current analysis software for RNA-seq data, such as DESeq, edgeR, and DSS, relies on point estimates of the parameters and does not account for the uncertainty in RNA-seq data. Here, we use Markov chain Monte Carlo (MCMC) methods to estimate the parameters relevant to detecting differentially expressed genes. At the end of the thesis, we compare the performance of DESeq, edgeR, DSS, and our method on both simulated and real RNA-seq data. Our log-linear model performs considerably better than DESeq, edgeR, and DSS when the numbers of replicates in the groups are similar or equal. When the number of replicates is extremely unbalanced between groups, we suggest the median estimator as the appropriate method for detecting differentially expressed genes.
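
The over-dispersion mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the thesis: it assumes the common parameterization in which a negative binomial count with mean `mu` and dispersion `phi` has variance `mu + phi * mu**2`, and it draws counts as a gamma-Poisson mixture. The function names and the values `mu = 20`, `phi = 0.5` are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean, variance


def nb_variance(mu, phi):
    """Variance of a negative binomial count with mean mu and dispersion phi
    (assumed parameterization: Var = mu + phi * mu^2)."""
    return mu + phi * mu ** 2


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.
    Adequate for moderate lam; not intended for very large means."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_negative_binomial(mu, phi, rng):
    """Draw one negative binomial count as a gamma-Poisson mixture.
    The gamma rate has shape 1/phi and scale mu*phi, so its mean is mu."""
    lam = rng.gammavariate(1.0 / phi, mu * phi)
    return sample_poisson(lam, rng)


def simulate_counts(mu, phi, n, seed=0):
    """Simulate n read counts for one hypothetical gene."""
    rng = random.Random(seed)
    return [sample_negative_binomial(mu, phi, rng) for _ in range(n)]


if __name__ == "__main__":
    counts = simulate_counts(mu=20.0, phi=0.5, n=20000)
    print("sample mean:", mean(counts))
    print("sample variance:", variance(counts))
    print("theoretical variance:", nb_variance(20.0, 0.5))
```

Under these assumptions the sample variance should come out far larger than the sample mean, which is the behavior a plain Poisson model (variance equal to the mean) cannot capture.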
Subjects
RNA-seq
Gene expression
Bayesian inference
Log-linear model
Type
thesis
File(s)
Name
ntu-104-R02621208-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):3d1a53464b4dba731dcd0a63a4cd5049
