A composite model for subgroup identification and prediction via bicluster analysis

Chen H.-C.;Zou W.;Tzu-Pin Lu;Chen J.J.

DC Field	Value	Language
dc.contributor.author	Chen H.-C.	en_US
dc.contributor.author	Zou W.	en_US
dc.contributor.author	TZU-PIN LU	en_US
dc.contributor.author	Chen J.J.	en_US
dc.creator	Chen H.-C.;Zou W.;Tzu-Pin Lu;Chen J.J.	-
dc.date.accessioned	2020-11-17T02:45:32Z	-
dc.date.available	2020-11-17T02:45:32Z	-
dc.date.issued	2014	-
dc.identifier.issn	1932-6203	-
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-84908635215&doi=10.1371%2fjournal.pone.0111318&partnerID=40&md5=bdb072aea9eda30e2a942616c6a864ab	-
dc.identifier.uri	https://scholars.lib.ntu.edu.tw/handle/123456789/520974	-
dc.description.abstract	Background: A major challenge in the analysis of large and complex biomedical data is to develop an approach for 1) identifying distinct subgroups in the sampled populations, 2) characterizing the relationships among subgroups, and 3) developing a prediction model to classify subgroup memberships of new samples by finding a set of predictors. Each subgroup can represent different pathogen serotypes of microorganisms, different tumor subtypes in cancer patients, or different genetic makeups of patients related to treatment response. Methods: This paper proposes a composite model for subgroup identification and prediction using biclusters. A biclustering technique is first used to identify a set of biclusters from the sampled data. For each bicluster, a subgroup-specific binary classifier is built to determine whether a particular sample is inside or outside the bicluster. A composite model, which consists of all binary classifiers, is constructed to classify samples into several disjoint subgroups. The proposed composite model depends neither on any specific biclustering algorithm or bicluster pattern nor on any classification algorithm. Results: The composite model was shown to have an overall accuracy of 97.4% for a synthetic dataset consisting of four subgroups. The model was applied to two datasets where the samples' subgroup memberships were known. The procedure showed 83.7% accuracy in discriminating lung cancer adenocarcinoma and squamous carcinoma subtypes, and was able to identify 5 serotypes and several subtypes with about 94% accuracy in a pathogen dataset. Conclusion: The composite model presents a novel approach to developing a biclustering-based classification model from unlabeled sampled data. The proposed approach combines unsupervised biclustering and supervised classification techniques to classify samples into disjoint subgroups based on their associated attributes, such as genotypic factors, phenotypic outcomes, efficacy/safety measures, or responses to treatments. The procedure is useful for identification of unknown species or new biomarkers for targeted therapy.	-
dc.language.iso	English	-
dc.publisher	Public Library of Science	-
dc.relation.ispartof	PLoS ONE	-
dc.subject.classification	[SDGs]SDG3	-
dc.subject.other	biological marker; tumor marker; Article; breast cancer; cancer classification; cancer patient; classification algorithm; cluster analysis; diagnostic accuracy; diagnostic test accuracy study; diagonal linear discriminant analysis; genotype; human; lung adenocarcinoma; lung cancer; lung squamous cell carcinoma; nonhuman; phenotype; prediction; random forest; Salmonella; sensitivity and specificity; serotype; support vector machine; algorithm; classification; cluster analysis; information processing; Algorithms; Biomarkers, Tumor; Cluster Analysis; Datasets as Topic; Humans	-
dc.title	A composite model for subgroup identification and prediction via bicluster analysis	en_US
dc.type	journal article	en
dc.identifier.doi	10.1371/journal.pone.0111318	-
dc.identifier.pmid	25347824	-
dc.identifier.scopus	2-s2.0-84908635215	-
dc.relation.journalvolume	9	-
dc.relation.journalissue	10	-
item.languageiso639-1	English	-
item.cerifentitytype	Publications	-
item.fulltext	no fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_6501	-
item.openairetype	journal article	-
item.grantfulltext	none	-
crisitem.author.dept	Institute of Health Data Analytics and Statistics	-
crisitem.author.dept	Public Health	-
crisitem.author.orcid	0000-0003-3697-0386	-
crisitem.author.parentorg	College of Public Health	-
crisitem.author.parentorg	College of Public Health	-
Appears in Collections:	Institute of Epidemiology and Preventive Medicine
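
Illustrative sketch (not from the paper): the Methods portion of the abstract describes building one "inside vs. outside" binary classifier per bicluster and combining them into a composite model that assigns samples to disjoint subgroups. The Python below is a minimal sketch of that idea under stated assumptions. The biclusters are taken as already found (any biclustering algorithm could supply them), the binary classifier is a simple nearest-centroid rule chosen only for brevity, and the rule for combining classifiers (pick the largest positive margin, else leave unassigned) is an assumption, since the abstract does not state how the authors resolve ties or conflicts. All function names and the toy data are hypothetical.

```python
from statistics import mean

def centroid(rows, cols):
    """Mean of each selected feature over the given rows."""
    return [mean(row[c] for row in rows) for c in cols]

def sq_dist(x, center, cols):
    """Squared Euclidean distance restricted to the selected features."""
    return sum((x[c] - m) ** 2 for c, m in zip(cols, center))

def fit_binary(data, sample_idx, feature_idx):
    """Nearest-centroid 'inside vs. outside' classifier for one bicluster.

    Assumption: only the bicluster's own features are used as predictors.
    """
    members = set(sample_idx)
    inside = [row for i, row in enumerate(data) if i in members]
    outside = [row for i, row in enumerate(data) if i not in members]
    cols = list(feature_idx)
    return {"cols": cols,
            "in": centroid(inside, cols),
            "out": centroid(outside, cols)}

def margin(clf, x):
    """Positive when x lies closer to the inside centroid than the outside one."""
    return sq_dist(x, clf["out"], clf["cols"]) - sq_dist(x, clf["in"], clf["cols"])

def fit_composite(data, biclusters):
    """One binary classifier per bicluster; together they form the composite."""
    return {name: fit_binary(data, s, f) for name, (s, f) in biclusters.items()}

def predict(composite, x):
    """Assign x to the bicluster with the largest positive margin, else None.

    Assumption: this combination rule is illustrative, not the authors' rule.
    """
    scores = {name: margin(clf, x) for name, clf in composite.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Toy data: samples 0-2 are high on features 0-1, samples 3-5 on features 2-3.
data = [
    [5.0, 5.1, 0.2, 0.1],
    [4.8, 5.3, 0.0, 0.3],
    [5.2, 4.9, 0.1, 0.2],
    [0.1, 0.2, 5.0, 4.9],
    [0.3, 0.0, 5.2, 5.1],
    [0.2, 0.1, 4.9, 5.0],
]
# Hypothetical biclusters: name -> (sample indices, feature indices).
biclusters = {"A": ([0, 1, 2], [0, 1]), "B": ([3, 4, 5], [2, 3])}
model = fit_composite(data, biclusters)

print(predict(model, [5.0, 5.0, 0.1, 0.1]))
print(predict(model, [0.2, 0.1, 5.0, 5.0]))
print(predict(model, [0.1, 0.1, 0.1, 0.1]))
```

Because each classifier only answers "inside or outside" for its own bicluster, the binary learner can be swapped for any other classifier without changing the composite step, which is consistent with the abstract's statement that the model does not depend on a specific biclustering or classification algorithm.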
