https://scholars.lib.ntu.edu.tw/handle/123456789/484372
Title: | Processing and analysis of imbalanced liver cancer patient data by case-based reasoning | Authors: | Lin, Y.-B.; Ping, X.-O.; Ho, T.-W.; FEI-PEI LAI |
Keywords: | case-base reasoning; imbalanced dataset; liver cancer; over-sampling; under-sampling | Date issued: | 2015 | Source publication: | BMEiCON 2014 - 7th Biomedical Engineering International Conference | Abstract: | Research on clinical data is one of the fastest-growing fields worldwide. Most clinical datasets are imbalanced, which can cause problems in research. In this study, over-sampling and under-sampling methods are used to handle data imbalance. Case-based reasoning (CBR) is used to develop classification models that predict the recurrence status of patients with liver cancer. The classification results of these two methods are compared with those of the original imbalanced dataset using standard indicators: sensitivity, specificity, balanced accuracy (BAC), positive predictive value (PPV), negative predictive value (NPV), and accuracy. According to the preliminary classification results, on average, the BAC of the balanced datasets produced by under-sampling (66.07%) and over-sampling (54.24%) improves significantly on that of the imbalanced dataset (48.33%). Most importantly, the under-sampling method achieves the highest mean accuracy of the three datasets (under-sampling: 66.76%, over-sampling: 53.47%, imbalanced: 48.58%). With the under-sampling method, mean PPV, NPV, and accuracy all exceed 65% (PPV: 65.44%, NPV: 69.44%, accuracy: 66.76%). Balanced datasets can benefit classification models and efficiently reduce biased interpretations. © 2014 IEEE. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/484372 | DOI: | 10.1109/BMEiCON.2014.7017371 | SDG/Keywords: | Biomedical engineering; Classification (of information); Clinical research; Diseases; Hospital data processing; Case-base reasonings; Imbalanced dataset; Liver cancers; Over sampling; Under-sampling; Case based reasoning |
Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
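
The abstract names six indicators and two rebalancing strategies without giving formulas or procedures. The sketch below shows the standard confusion-matrix definitions of those indicators, together with one simple way to balance a dataset. It rests on assumptions: the function names are hypothetical, and plain random sampling is used for illustration only, since the record does not say which over-sampling or under-sampling technique the authors applied.

```python
import random


def classification_metrics(tp, fp, tn, fn):
    """Standard indicators computed from confusion-matrix counts.

    Assumes every denominator is non-zero (no guard against empty classes).
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "bac": (sensitivity + specificity) / 2,   # balanced accuracy
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def _group_by_class(records, labels):
    """Collect records into a dict keyed by class label."""
    by_class = {}
    for rec, lab in zip(records, labels):
        by_class.setdefault(lab, []).append(rec)
    return by_class


def random_under_sample(records, labels, seed=0):
    """Assumed technique: randomly keep as many records per class
    as the smallest class contains."""
    rng = random.Random(seed)
    by_class = _group_by_class(records, labels)
    target = min(len(recs) for recs in by_class.values())
    balanced = []
    for lab, recs in by_class.items():
        balanced.extend((rec, lab) for rec in rng.sample(recs, target))
    return balanced


def random_over_sample(records, labels, seed=0):
    """Assumed technique: duplicate randomly chosen records until every
    class is as large as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(records, labels)
    target = max(len(recs) for recs in by_class.values())
    balanced = []
    for lab, recs in by_class.items():
        extra = [rng.choice(recs) for _ in range(target - len(recs))]
        balanced.extend((rec, lab) for rec in recs + extra)
    return balanced
```

For example, `classification_metrics(tp=40, fp=10, tn=30, fn=20)` gives a sensitivity of 40/60 and a specificity of 30/40, so the BAC is their mean, while accuracy is 70/100. BAC weights both classes equally, which is why it is often reported alongside accuracy for imbalanced data.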