Title: Rule-based classification models of molecular autofluorescence
Authors: Su, Bo-Han; Tu, Yi-Shu; Lin, Olivia A.; Harn, Yeu-Chern; Shen, Meng-Yu; Tseng, Yufeng Jane
Type: journal article
Year: 2015
ISSN: 15499596
DOI: 10.1021/ci5007432
PMID: 25625768
Scopus ID: 2-s2.0-84923343419
Scopus URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-84923343419&partnerID=MN8TOARS
Repository URL: http://scholars.lib.ntu.edu.tw/handle/123456789/390818
Record date: 2018-09-10

Abstract: Fluorescence-based detection has been commonly used in high-throughput screening (HTS) assays. Autofluorescent compounds, which can emit light in the absence of artificial fluorescent markers, often interfere with the detection of fluorophores and result in false positive signals in these assays. This interference presents a major issue in fluorescence-based screening techniques. In an effort to reduce the time and cost that will be spent on prescreening of autofluorescent compounds, in silico autofluorescence prediction models were developed for selected fluorescence-based assays in this study. Five prediction models were developed based on the respective fluorophores used in these HTS assays, which absorb and emit light at specific wavelengths (excitation/emission): Alexa Fluor 350 (A350) (340 nm/450 nm), 7-amino-4-trifluoromethyl-coumarin (AFC) (405 nm/520 nm), Alexa Fluor 488 (A488) (480 nm/540 nm), Rhodamine (547 nm/598 nm), and Texas Red (547 nm/618 nm). The C5.0 rule-based classification algorithm and PubChem 2D chemical structure fingerprints were used to develop prediction models. To optimize the accuracies of these prediction models despite the highly imbalanced ratio of fluorescent versus nonfluorescent compounds presented in the collected data sets, oversampling and undersampling strategies were applied. The average final accuracy achieved for the training set was 97%, and that for the testing set was 92%. In addition, five external data sets were used to further validate the models. Ultimately, 14 representative structural features (or rules) were determined to efficiently predict autofluorescence in data sets containing both fluorescent and nonfluorescent compounds. Several cases were illustrated in this study to demonstrate the applicability of these rules. © 2015 American Chemical Society.

Keywords: Fluorescence; Fluorophores; Forecasting; Autofluorescent compounds; Fluorescence-based detection; Fluorescent markers; High-throughput screening; Non-fluorescent compounds; Rule-based classification; Screening techniques; Structural feature; Predictive analytics; fluorescent dye; algorithm; chemical model; chemistry; classification; cluster analysis; computer simulation; fluorescence; fuzzy logic; high throughput screening; machine learning; predictive value; procedures; quantitative structure activity relation; structure activity relation; Algorithms; Cluster Analysis; Computer Simulation; Fluorescence; Fluorescent Dyes; Fuzzy Logic; High-Throughput Screening Assays; Machine Learning; Models, Chemical; Predictive Value of Tests; Quantitative Structure-Activity Relationship; Structure-Activity Relationship
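
Note: The abstract mentions oversampling and undersampling to handle the imbalance between fluorescent and nonfluorescent compounds, but this record does not say which strategies were used. The following is only a minimal sketch of plain random resampling, given as one possible reading and not as the authors' procedure. The function name `rebalance`, the `target_per_class` parameter, and the toy 3-bit "fingerprints" are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def rebalance(samples, labels, target_per_class, seed=0):
    """Resample so that every class ends up with target_per_class items.

    Classes larger than the target are randomly undersampled (without
    replacement); smaller classes keep all their items and are padded
    with random duplicates (oversampling with replacement).
    Hypothetical helper for illustration, not the published method.
    """
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    out_x, out_y = [], []
    for y, items in by_class.items():
        if len(items) >= target_per_class:
            chosen = rng.sample(items, target_per_class)
        else:
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        out_x.extend(chosen)
        out_y.extend([y] * target_per_class)
    return out_x, out_y


# Toy data (invented): 2 "fluorescent" vs 18 "nonfluorescent" bit vectors.
fingerprints = [(1, 0, 1), (1, 1, 0)] + [(0, 0, i % 2) for i in range(18)]
labels = ["fluorescent"] * 2 + ["nonfluorescent"] * 18

bal_x, bal_y = rebalance(fingerprints, labels, target_per_class=6)
print(Counter(bal_y))
```

A balanced set like this would then be passed to the classifier in place of the raw, skewed data. Duplicating minority samples can encourage overfitting, so in practice the resampling is normally applied to the training split only.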