Gene Set Enrichment Analysis of RNA-Seq data
Date Issued
2016
Date
2016
Author(s)
Li, Pei-Hsun
Abstract
In recent years, RNA-Seq technology has been widely used to study the transcriptome because it has clear advantages over other transcriptomic technologies. The most common application of RNA-Seq is identifying differentially expressed genes. Gene set analysis (GSA) goes further: it aims to determine whether a predefined gene set, in which the genes share a common biological function, is correlated with the phenotype. To date, many GSA approaches have been developed for identifying differentially expressed gene sets using microarray data. However, these methods are not directly applicable to RNA-Seq data because of intrinsic differences between the two data types. In addition, when testing the differential expression of gene sets, most GSA methods rely on the critical assumption that the members of each gene set are sampled independently. This assumption implies that the genes within a gene set do not share a common biological function, which contradicts the definition of a gene set. To resolve this issue, we propose a GSA method based on the De-correlation (DECO) algorithm by Dougu Nam (2010) to remove the correlation bias in the expression of each gene set. We compare the performance of our proposed method with that of other GSA methods through simulation studies under various scenarios, combined with four different normalization methods. We found that our proposed method outperforms the others in terms of Type I error rate and empirical power.
Subjects
gene set analysis
differentially expressed
DECO
correlation bias
Type
thesis
File(s)
Name
ntu-105-R03621207-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):00632ecbd6e1712dc73c853ee4d5e8ee