https://scholars.lib.ntu.edu.tw/handle/123456789/456479
Title: | Cost-sensitive learning for recurrence prediction of breast cancer | Authors: | Cheng T.-H; Lan C.-W; CHIH-PING WEI; Chang H. |
Keywords: | Breast cancer; Cost-sensitive learning; Data mining; Recurrence prediction; Survival analysis | Publication date: | 2010 | Pages: | 1218-1228 | Source publication: | PACIS 2010 - 14th Pacific Asia Conference on Information Systems | Abstract: | Breast cancer is one of the leading causes of cancer death and accounts for 10.4% of all cancer incidence among women. Predicting breast cancer recurrence has been a challenging research problem. Data mining techniques have recently received considerable attention, especially for constructing prognosis models from survival data. However, existing data mining techniques may not handle censored data effectively, and censored instances are often discarded when classification techniques are applied to prognosis. In this paper, we propose a cost-sensitive learning approach that involves the censored data in prognostic assessment and offers better recurrence prediction capability. The proposed approach employs an outcome inference mechanism to infer the possible probabilistic outcome of each censored instance, and adopts cost-proportionate rejection sampling and a committee machine strategy to take these instances with probabilistic outcomes into account during the classification model learning process. We empirically evaluate the effectiveness of the proposed approach for breast cancer recurrence prediction, using a censored-data-discarding method (i.e., building the recurrence prediction model from uncensored data only) and the Kaplan-Meier method (a common prognosis method) as performance benchmarks. Overall, our evaluation results suggest that the proposed approach outperforms the benchmark techniques as measured by precision, recall, and F1 score. |
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-84855985813&partnerID=40&md5=15579aa6127b7ba844d367077c2f10ef https://scholars.lib.ntu.edu.tw/handle/123456789/456479 |
SDG/Keywords: | Breast Cancer; Censored data; Classification models; Classification technique; Committee machines; Cost-sensitive learning; Data mining techniques; Evaluation results; Inference mechanism; Kaplan-Meier method; Prediction capability; Prediction model; Prognosis models; Research problems; Survival analysis; Survival data; Benchmarking; Costs; Data mining; Forecasting; Information systems; Mathematical models; Diseases |
Appears in collections: | Department of Information Management |
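The abstract names cost-proportionate rejection sampling and a committee machine strategy but gives no implementation details, so the following is only a minimal sketch of how those two ideas might fit together, not the authors' method. It assumes that each censored instance arrives with an inferred recurrence probability `p` and is split into two weighted copies (recurrence with cost `p`, non-recurrence with cost `1 - p`), while uncensored instances carry cost 1. All function names and the data layout are hypothetical.

```python
import random
from collections import Counter


def expand_instances(uncensored, censored):
    """Build (features, label, cost) triples.

    uncensored: list of (features, label) with known outcomes (cost 1.0).
    censored:   list of (features, p_recur), where p_recur is an assumed
                inferred probability of recurrence for that instance.
    """
    weighted = [(x, y, 1.0) for x, y in uncensored]
    for x, p in censored:
        weighted.append((x, 1, p))        # copy labeled "recurrence"
        weighted.append((x, 0, 1.0 - p))  # copy labeled "no recurrence"
    return weighted


def rejection_sample(weighted, rng):
    """Keep each instance with probability cost / max_cost."""
    z = max(c for _, _, c in weighted)
    return [(x, y) for x, y, c in weighted if rng.random() < c / z]


def committee_samples(weighted, n_members, seed=0):
    """Draw one independent rejection sample per committee member."""
    rng = random.Random(seed)
    return [rejection_sample(weighted, rng) for _ in range(n_members)]


def majority_vote(predictions):
    """Combine the committee members' predicted labels for one instance."""
    return Counter(predictions).most_common(1)[0][0]
```

In this sketch, a classifier of any kind would be trained on each sample returned by `committee_samples`, and `majority_vote` would combine the members' predictions for a new patient. An odd committee size avoids ties in the binary case.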