支撐向量法之研究

林智仁臺灣大學：資訊工程學研究所陳培勳Chen, Pai-HsuenPai-HsuenChen2007-11-262018-07-052007-11-262018-07-052007http://ntur.lib.ntu.edu.tw//handle/246246/54049支撐向量法(support vector machines) 是資料分類(data classification) 的一種有效的方法.它不僅有紮實的理論基礎而且應用於實際問題不輸於類神經網路(neural networks) 或決策樹(decision trees)等其他方法.這篇論文研究加快支撐向量法的訓練及發展新的應用. 分段法(decomposition methods)是目前訓練支撐向量法主要的方法之一.各種分段法主要差別在於計算過程變數選擇(working set selection)的方式.目前的方法及分析通常僅考慮特定的選取方式. 本論文對支撐向量法的依序最佳化(Sequential Minimal Optimization, SMO)分段法提出一般性的選擇方式並作完整的探討. 主要結果包括1)簡明的極限收斂證明(asymptotic convergence proof), 2)快取(caching)及變數移除(shrinking)的理論說明,及3)直線收斂速率(linear convergence)的證明.並討論其他支撐向量法的變形.因此所有合乎本文的變數選擇方式都能應用所有的理論結果. 在實作上SMO分段法主要選擇抵觸最佳解的變數,這和利用目標函數(objective function)一級差異(first order information) [即梯度(gradient)]是相同的.經由先前所述一般化的變數選取探討, 本論文提出利用二級差異(second order information)的方法.這方法能符合先前所述的理論性質,而且此方法也能應用於非有限核心陣列(indefinite kernel matrices).實驗證明新的方法比原先的方法更快速. 最後我們將支撐向量法應用在醫學影像上脊柱(spinal cord)的標定.過去的做法通常需要專家提供具體的知識.用支撐向量法可以建立自動的系統.支撐向量法標定的結果由專家評定為成功,需修改,或失敗.經由漸進式學習(incremental learning)加強,支撐向量法可得到94.67%的成功率, 4.47%需修改,及0.96%失敗率.支撐向量法在醫學影像標定上會是有用的方法.Support vector machines (SVMs) have been an effective technique for data classification.Not only it has a solid theoretical foundation, practical comparisons have also shown that SVM is competitive with existing methods such as neural networks and decision trees. This thesis studies fast training of SVM and develops new SVM applications. Decomposition methods are currently one of the major methods for training support vector machines. They vary mainly according to different working set selections. Existing implementations and analysis usually consider some specific selection rules. This thesis first gives a comprehensive study on Sequential Minimal Optimization (SMO)-type decomposition methods under a general and flexible way of choosing the working set. Main results include: 1) a simple asymptotic convergence proof, 2) a general explanation of the shrinking and caching techniques, and 3) the linear convergence of the method. Extensions to some SVM variants are also discussed. Thus, all results here apply to any SMO-type implementation whose selection rule meets the criteria of this work. For practical implementations, SMO-decomposition methods mainly select working sets via the violation of the optimality condition, which also corresponds to first order (i.e., gradient) information of the objective function. Following the general framework of SMO-type methods discussed in this thesis, we develop a simple working set selection using second order information. Thus, theoretical properties such as linear convergence are established. It can be extended for indefinite kernel matrices. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information. We then apply SVMs to spinal canal segmentation, a medical image application. Past work usually requires substantial knowledge form human experts, but using SVM, we build an automatic system. Predicted results are evaluated by human experts and classified as successful, editable or unsuccessful. We can further improve SVM classifiers by incremental learning, where new training data are added and unnecessary ones are removed. The SVM classifier achieves 94.67% successful, 4.47% editable, and 0.96%unsuccessful rate without human interventions. Our system demonstrates that SVM is a useful tool for medical image segmentation.口試委員會審定書 i 中文摘要 ii ABSTRACT iii LIST OF FIGURES viii LIST OF TABLES x CHAPTER I. Introduction 1 II. Basic Concepts of SVM 3 2.1 Linear Separating Hyperplane with Maximal Margin 3 2.2 Mapping Data to Higer Dimensional Spaces 4 2.3 The Dual Problem 7 2.4 Kernel and Decision Functions 9 2.5 Parameter Selection 11 III. A Study on SMO-type Decomposition Methods for Support Vector Machines 13 3.1 SMO-type Decomposition Methods 13 3.2 Existing and New Working Set Selection for SMO-type Methods 16 3.2.1 Existing Selections 16 3.2.2 A General Working Set Selection for SMO-type De- composition Methods 19 3.3 Asymptotic Convergence 21 3.3.1 Solving the sub-problem 21 3.3.2 Proof of Asymptotic Convergence 26 3.4 Stopping Condition, Shrinking and Caching 32 3.4.1 Stopping Condition and Finite Termination 35 3.4.2 Shrinking and Caching 38 3.5 Convergence Rate 42 3.6 Extensions 48 3.6.1 Support Vector Regression and One-class SVM 48 3.6.2 Extension to u-SVM 51 3.7 Discussion and Conclusions 54 3.7.1 Faster Training via a Better Selection Rule 54 3.7.2 Necessary Conditions for the Convergence 55 3.8 Supplement Material 57 3.8.1 Counter Example of ALAL^*=0 58 3.8.2 Counter Example of Theorem 1 59 IV. Working Set Selection Using Second Order Information for Training Support Vector Machines 61 4.1 Existing and New Working Set Selections 61 4.1.1 A New Working Set Selection 63 4.1.2 Non-Positive Definite Kernel Matrices 66 4.2 Theoretical Properties 69 4.3 Experiments 72 4.3.1 Data and Experimental Settings 73 4.4 Maintaining Feasibility in Sub-problems for Working Set Se- selections 83 4.4.1 Solutions of (4.5) and (4.22) in final iterations 84 4.4.2 Experiments 87 4.4.3 Sub-problems Using First Order Information 89 4.5 Discussion and Conclusions 90 4.6 Supplement Material 91 4.6.1 WSS 1 Solves Problem (4.1): the Proof 91 4.6.2 Pseudo Code of Algorithms 3 and WSS 6 92 V. Automatic Spinal Canal Segmentation of CT Images by Support Vector Machines 95 5.1 Medical Image Segmentation 96 5.2 Methods and Materials 98 5.2.1 Data 99 5.2.2 Image Features as Trainging Data 100 5.2.3 Training/Prediction via SVM 103 5.2.4 Post Processing 105 5.2.5 Evaluation 105 5.2.6 Incremental Learning 106 5.2.7 Testing Variants of CT Images 106 5.3 Results 107 5.3.1 Performance of Different Training Settings 107 5.3.2 Improving Performance by Incremental Learing 108 5.3.3 Testing CT Images from Other Hospitals 110 5.4 Conclusions and Discussion 111 VI. Discussion and Conclusions 1131319295 bytesapplication/pdfen-US支撐向量法資料分類分段法依序最佳化醫學影像脊柱support vector machinesdata classificationdecompostion methodssequential minimal optimizationmedical imagesspinal cord支撐向量法之研究A Study on Support Vector Machinesthesishttp://ntur.lib.ntu.edu.tw/bitstream/246246/54049/1/ntu-96-F90922008-1.pdf