Li, Chen-Wei (李振維)
Advisor: 陳正剛
Graduate Institute of Industrial Engineering, National Taiwan University, 2006
Deposited 2007-11-26; last updated 2018-06-29
http://ntur.lib.ntu.edu.tw//handle/246246/51165

Abstract

In general, there are two kinds of linear classification methods: the Minimum Squared Error (MSE) approach and the Fisher Linear Discriminant (FLD). Because linear methods are not sufficient to analyze data with nonlinear patterns, the kernel-based nonlinear methods, Kernel Minimum Squared Error (KMSE) and Kernel Fisher Discriminant (KFD), were developed from MSE and FLD, respectively. Both transform the instances from the original attribute space into a high-dimensional feature space, in which the linear methods are then applied. The objective of FLD and KFD is to find the directions onto which the projections of the training instances, the discriminant scores, provide the maximal separability among classes. However, FLD and KFD are known to be inefficient for datasets with a large number of attributes or instances, respectively. To improve computing efficiency, we use MSE for linear classification problems. For multi-class problems, however, MSE, like the Support Vector Machine (SVM), can use only the one-against-one or the one-against-the-rest approach.
Both are inefficient compared with FLD and KFD, where a single model is built to discriminate multiple classes simultaneously. We therefore develop a multi-class MSE that uses the Sherman-Woodbury formula to improve computational efficiency. It handles multiple classes simultaneously through a class-labeling scheme, and the candidate class-labeling schemes are determined by the Gram-Schmidt process. Its nonlinear counterpart, the multi-class KMSE, is developed from the multi-class MSE. A simulated example is then used to show how the proposed method works and to visualize the meaning of the class-labeling scheme. Finally, two real-world datasets are used to compare the proposed method with other conventional methods.

Contents

Abstract
論文摘要 (Chinese Abstract)
Contents
Contents of Figures
Contents of Tables
Chapter 1: Introduction
  1.1 Background
  1.2 Current Linear Approaches for Classification
    1.2.1 Fisher Linear Discriminants
    1.2.2 Minimum Squared Error Approach
  1.3 Kernel Fisher Discriminants
  1.4 Problems of Current Linear and Nonlinear Classification Approaches and Research Objectives
  1.5 Thesis Organization
Chapter 2: Multi-Class Kernel MSE with Sherman-Woodbury Formula
  2.1 Multi-class Minimum Squared Error Approach
  2.2 Multi-class Kernel Minimum Squared Error Approach
  2.3 Determination of the Optimal Class-Labeling Scheme
    2.3.1 Gram-Schmidt Process
  2.4 Sherman-Woodbury Formula
  2.5 Illustration with a Simulated Example
Chapter 3: Case Study
  3.1 Medline Text Dataset
  3.2 Hayes-Roth Dataset
Chapter 4: Conclusions and Suggestions on Future Research
References
Appendix A: Principal Component Analysis
Appendix B: C# Code

Keywords: classification method, FLD, KFD, MSE, KMSE, efficiency
Title: Effective Multi-class Kernel MSE Classifier with Sherman-Woodbury Formula (利用 Sherman-Woodbury 公式之高效率多類別非線性最小平方誤差分類器)
Type: thesis; format: application/pdf, 1,538,452 bytes; language: en-US
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/51165/1/ntu-95-R93546018-1.pdf
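The Sherman-Woodbury formula cited in the abstract is the Sherman-Morrison-Woodbury matrix identity: the inverse of a low-rank update, (A + UCV)^(-1), can be obtained from A^(-1) by solving only a small k-by-k system, which is what makes the repeated inversions in the multi-class MSE cheap. A minimal numerical sketch with NumPy follows; the matrices below are illustrative, not taken from the thesis.

```python
import numpy as np

def woodbury_inverse(A_inv, U, C, V):
    """Inverse of (A + U C V) given A^{-1}, via the Sherman-Morrison-Woodbury
    identity: A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}.
    Only a k x k matrix is inverted, cheap when k << dim(A)."""
    inner = np.linalg.inv(np.linalg.inv(C) + V @ A_inv @ U)  # k x k inverse
    return A_inv - A_inv @ U @ inner @ V @ A_inv

n, k = 6, 2
A = 3.0 * np.eye(n)                       # easy-to-invert base matrix
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [0.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
C = np.eye(k)
V = U.T                                    # rank-2 symmetric update A + U U^T

fast = woodbury_inverse(np.linalg.inv(A), U, C, V)
direct = np.linalg.inv(A + U @ C @ V)      # full n x n inversion, for checking
print(np.allclose(fast, direct))           # True
```

The identity is exact, so the two inverses agree to floating-point precision; the savings come from inverting a 2x2 matrix instead of a 6x6 one (and, in the thesis's setting, a k-by-k matrix instead of one sized by the number of instances).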
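The class-labeling scheme assigns each class a target vector for the regression-style MSE fit, and the abstract states that the candidate schemes are generated by the Gram-Schmidt process. As a minimal sketch of that process (the three starting label vectors here are hypothetical, not the thesis's actual scheme), classical Gram-Schmidt turns linearly independent vectors into an orthonormal set spanning the same subspace:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize the rows of `vectors`."""
    ortho = []
    for v in vectors:
        w = v.astype(float)
        for q in ortho:
            w = w - (q @ v) * q        # subtract the component of v along q
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        ortho.append(w / norm)
    return np.array(ortho)

# Hypothetical candidate label vectors for a 3-class problem.
labels = np.array([[1.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
Q = gram_schmidt(labels)
print(np.round(Q @ Q.T, 6))            # approximately the 3x3 identity
```

Orthogonal label vectors keep the per-class discriminant scores from interfering with one another, which is why different orthonormal bases produced by Gram-Schmidt yield the different class-labeling schemes compared in the thesis.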