Title: Cross-generation and cross-laboratory predictions of Affymetrix microarrays by rank-based methods
Authors: Liu, Hsi-Che; CHIEN-YU CHEN; Liu, Yu-Ting; Chu, Cheng-Bang; Liang, Der-Cherng; Shih, Lee-Yung; CHIH-JEN LIN
Type: journal article
Year: 2008
Record date: 2018-09-10
ISSN: 1532-0464
DOI: 10.1016/j.jbi.2007.11.005
PMID: 18234562
Scopus ID: 2-s2.0-46649094432
Format: application/pdf
Size: 173516 bytes
URLs:
http://www.scopus.com/inward/record.url?eid=2-s2.0-46649094432&partnerID=MN8TOARS
http://scholars.lib.ntu.edu.tw/handle/123456789/340482
https://www.scopus.com/inward/record.uri?eid=2-s2.0-46649094432&doi=10.1016%2fj.jbi.2007.11.005&partnerID=40&md5=44ace2043492ea731c6602b28373d52d
Keywords: Affymetrix microarrays; Cross-generation/laboratory prediction; Rank-based normalization
SDGs: SDG3
Index terms: Forecasting; Transients; Affymetrix (CO); Affymetrix microarrays; Elsevier (CO); human microarrays; prediction accuracy; Public data; Mathematical models; accuracy; acute granulocytic leukemia; acute lymphoblastic leukemia; article; breast cancer; cancer genetics; DNA microarray; gene mapping; good laboratory practice; prediction; priority journal; Algorithms; Data Interpretation, Statistical; Databases, Factual; Gene Expression Profiling; Humans; Information Storage and Retrieval; Laboratories; Neoplasm Proteins; Neoplasms; Oligonucleotide Array Sequence Analysis; Reproducibility of Results; Sensitivity and Specificity; Tumor Markers, Biological

Abstract: Past experiments with the popular Affymetrix (Affy) microarrays have accumulated a huge number of public data sets. To apply them to broader studies, comparability across generations and experimental environments is an important research topic. This paper investigates cross-generation/laboratory prediction in particular, that is, whether models built on data from one generation (laboratory) can differentiate data from another. We consider eight public data sets covering three cancers. They come from different laboratories and span various generations of Affy human microarrays. Each cancer has certain subtypes, and we investigate whether a model trained on one set correctly differentiates another. We propose a simple rank-based approach to make data from different sources more comparable. Results show that it leads to higher prediction accuracy than using expression values. We further investigate normalization issues in preparing training/testing data. In addition, we discuss some pitfalls in evaluating cross-generation/laboratory predictions. To use data from various sources, one must be cautious about some important but easily neglected steps. © 2007 Elsevier Inc. All rights reserved.
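
The rank-based idea mentioned in the abstract can be illustrated with a small sketch. The record does not give the paper's exact procedure, so the following is only one plausible reading, under stated assumptions: each sample's expression values are replaced by their ranks within that sample, ranks are computed over the probes shared by the arrays being compared, and ties receive the average rank. The function names `rank_transform` and `rank_samples` and the probe identifiers are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def rank_transform(values: List[float]) -> List[float]:
    """Replace each value by its 1-based rank; tied values share the average rank."""
    # Indices of the values, sorted by ascending value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Average of the 1-based positions start+1 .. end+1.
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def rank_samples(samples: List[Dict[str, float]],
                 shared_probes: List[str]) -> List[List[float]]:
    """Rank each sample independently over a common probe list.

    Assumption: restricting to shared probes is how arrays with different
    probe sets are aligned; the source does not spell this out.
    """
    return [rank_transform([sample[probe] for probe in shared_probes])
            for sample in samples]
```

In this sketch, two samples measured on different intensity scales but with the same ordering of probes map to identical rank vectors, which is the sense in which ranks may make data from different sources more comparable than raw expression values. A classifier would then be trained on the ranked training set and applied to the ranked test set.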