Authors: Hsu C.-L.; Wen-Chung Lee
Record dates: 2020-11-19; 2020-11-19
Publication year: 2010
ISSN: 0300-5771
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-78649790236&doi=10.1093%2fije%2fdyq093&partnerID=40&md5=705a0f3d702707a080d58a5b9936edb6
Repository URL: https://scholars.lib.ntu.edu.tw/handle/123456789/521783

Abstract:
Background: Microarray technology provides information about the expression of hundreds and thousands of genes in a single experiment. To search for disease-related genes, researchers test for genes that are differentially expressed between case subjects and control subjects.
Methods: The authors propose a new test, the 'half Student's t-test', specifically for detecting differentially expressed genes in heterogeneous diseases. Monte-Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. The power of the half Student's t is higher than that of the conventional 'pooled' Student's t when there is heterogeneity in the disease under study. The power gain from using the half Student's t can reach ~10% when the standard deviation of the case group is 50% larger than that of the control group.
Results: Application to a colon cancer data set reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student's t can detect 344 differentially expressed genes, whereas the pooled Student's t can detect only 65 genes. Alternatively, if only 50 genes are to be selected, the FDR for the pooled Student's t has to be set at 0.0320 (false positive rate of ~3%), but for the half Student's t it can be as low as 0.0001 (false positive rate of about one per ten thousand).
Conclusions: The half Student's t-test is recommended for the detection of differentially expressed genes in heterogeneous diseases.

Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2010; all rights reserved.

Language: English
[SDGs]: SDG3
Keywords: cancer; detection method; disease treatment; epidemiology; gene expression; heterogeneity; Monte Carlo analysis; article; colon cancer; controlled study; gene identification; genetic association; Monte Carlo method; priority journal; Student t test; Colonic Neoplasms; Computer Simulation; Gene Expression; Humans; Models, Statistical; Monte Carlo Method; Statistics, Nonparametric
Title: Detecting differentially expressed genes in heterogeneous diseases using half Student's t-test
Type: journal article
DOI: 10.1093/ije/dyq093
PMID: 20519335
Scopus ID: 2-s2.0-78649790236
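
The abstract reports gene counts at a controlled FDR of 0.05 but does not say which FDR procedure was used. As a rough illustration of how per-gene p-values turn into a list of selected genes, the sketch below assumes the Benjamini-Hochberg step-up procedure. The function name `benjamini_hochberg` and the p-values are hypothetical, and the code does not implement the half Student's t-test itself.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure (an assumed choice;
    the abstract does not name its FDR method)."""
    m = len(p_values)
    # Rank the hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= fdr * k / m.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values, for illustration only.
p_values = [0.0002, 0.009, 0.013, 0.04, 0.20, 0.74]
selected = benjamini_hochberg(p_values, fdr=0.05)
print(selected)  # indices of the genes declared differentially expressed
```

Under this procedure, a test that yields smaller p-values for truly differential genes lets more of them pass the same FDR threshold, which is the kind of gain the abstract describes.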