Detecting differentially expressed genes in heterogeneous diseases using half Student's t-test
Journal
International Journal of Epidemiology
Journal Volume
39
Journal Issue
6
Pages
1597-1604
Date Issued
2010
Author(s)
Hsu C.-L.
Abstract
Background: Microarray technology provides expression data on hundreds and thousands of genes in a single experiment. To search for disease-related genes, researchers test for genes that are differentially expressed between case subjects and control subjects. Methods: The authors propose a new test, the 'half Student's t-test', specifically for detecting differentially expressed genes in heterogeneous diseases. Monte Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. The power of the half Student's t is higher than that of the conventional 'pooled' Student's t when there is heterogeneity in the disease under study. The power gain from using the half Student's t can reach ~10% when the standard deviation of the case group is 50% larger than that of the control group. Results: Application to a colon cancer data set reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student's t detects 344 differentially expressed genes, whereas the pooled Student's t detects only 65 genes. Alternatively, if only 50 genes are to be selected, the FDR for the pooled Student's t has to be set at 0.0320 (a false positive rate of ~3%), but for the half Student's t it can be set as low as 0.0001 (a false positive rate of about one in ten thousand). Conclusions: The half Student's t-test is recommended for the detection of differentially expressed genes in heterogeneous diseases. Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2010; all rights reserved.
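The gene counts quoted at an FDR of 0.05 come from applying a multiple-testing procedure to the per-gene p-values. The abstract does not say which FDR procedure was used, so the following is only a minimal sketch that assumes the Benjamini-Hochberg step-up rule. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the article.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg
    step-up rule at the given FDR level.

    Assumption: BH is used here for illustration only; the abstract does
    not name the FDR procedure applied in the article.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * fdr / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])


# Hypothetical per-gene p-values (one entry per gene).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
selected = benjamini_hochberg(p_values, fdr=0.05)
print(len(selected), "genes selected at FDR 0.05:", selected)
```

Under this rule, a test that yields smaller p-values for truly differentially expressed genes will pass more genes at the same FDR level, which is the kind of comparison the abstract reports between the half and pooled Student's t.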
Other Subjects
cancer; detection method; disease treatment; epidemiology; gene expression; heterogeneity; Monte Carlo analysis; article; colon cancer; controlled study; gene expression; gene identification; genetic association; Monte Carlo method; priority journal; Student t test; Colonic Neoplasms; Computer Simulation; Gene Expression; Humans; Models, Statistical; Monte Carlo Method; Statistics, Nonparametric
Type
journal article