Publication: A Decision Tree Classifier for Big Data Analytics on Credit Assessment Problem
dc.contributor | Advisor: 陳靜枝 | |
dc.contributor | National Taiwan University: Graduate Institute of Information Management | zh_TW |
dc.contributor.author | Lei, Weng-U | en |
dc.creator | Lei, Weng-U | en |
dc.date | 2014 | |
dc.date.accessioned | 2014-11-29T11:51:34Z | |
dc.date.accessioned | 2018-06-29T13:15:29Z | |
dc.date.available | 2014-11-29T11:51:34Z | |
dc.date.available | 2018-06-29T13:15:29Z | |
dc.date.issued | 2014 | |
dc.description.abstract | Credit assessment is a large-scale problem for financial institutions. Their need to reduce the risk of customer debt can be met by applying data mining techniques to decide whether a new application should be approved. The problem, however, arises in a Big Data environment: complicated preprocessing steps are required because the data sources are vast and messy. This study proposes a Decision-Tree-Based Credit Assessment Approach (DTCAA) to solve the problem. The decision tree model is selected for its interpretability and easily understood rules, as well as its competitive performance. The approach also includes various methods for data preprocessing; these consolidation steps reduce the messiness of the raw data and facilitate implementation. Using real data acquired from one of the three biggest car collateral loan companies in Taiwan, the experiments indicate that the decision tree is competitive across various situations. Across the multiple factors examined, the experiments suggest that DTCAA is usable in practice. | en |
dc.description.tableofcontents | Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Motivation; 1.2 Objective; 1.3 Scope; Chapter 2 Literature Review; 2.1 Credit Assessment Problems; 2.2 Supervised Learning / Classification; 2.3 Decision Tree; 2.4 Conclusion; Chapter 3 Problem Description; 3.1 The Credit Assessment Problem; 3.2 Classification and Decision Tree; 3.3 The Big Data Environment; 3.4 Problem Statement; 3.5 Summary; Chapter 4 The Decision-Tree Based Credit Assessment Approach; 4.1 Step 1: Data Analysis and Preprocessing; 4.1.1 Defining the Target Variable; 4.1.2 Consolidating data; 4.1.3 Data Sampling and Attribute Selection; 4.1.4 Data Partition; 4.2 Step 2: Decision Tree Models Building; 4.2.1 Model Building; 4.2.2 Model Assessment; 4.3 Step 3: Data Prediction and Scoring; 4.4 Complexity; Chapter 5 Computational Analysis; 5.1 Data Description; 5.2 Factors; 5.2.1 Target Variable; 5.2.2 Different Multi-Class Approaches; 5.2.3 Variable Selection; 5.3 Experiments; 5.3.1 Case 1: Balance Dataset with 1 run; 5.3.2 Case 2: Balance Dataset with 30 runs; 5.3.3 Case 3: Imbalance Dataset with 1 run; 5.3.4 Case 4: Imbalance Dataset with 30 runs; 5.4 Summary; Chapter 6 Conclusion and Future Work; 6.1 Conclusion; 6.2 Future Work; Reference | zh_TW |
dc.format.extent | 10687378 bytes | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://ntur.lib.ntu.edu.tw//handle/246246/263483 | |
dc.identifier.uri.fulltext | http://ntur.lib.ntu.edu.tw/bitstream/246246/263483/1/ntu-103-R01725019-1.pdf | |
dc.language | en_US | |
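dc.rights | Thesis public release date: 2024/12/31 | |
dc.rights | Thesis usage rights: paid licensing authorized (royalties returned to the university) | |
dc.subject | Credit assessment | zh_TW |
dc.subject | Decision tree | zh_TW |
dc.subject | Large-scale data | zh_TW |
dc.subject | Massive data | zh_TW |
dc.subject | Big data | zh_TW |
dc.subject | Data mining | zh_TW |
dc.subject | Data consolidation | zh_TW |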
dc.title | A Decision Tree Classifier for Big Data Analytics on Credit Assessment Problem | en |
dc.type | thesis | en |
dspace.entity.type | Publication |