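College of Electrical Engineering and Computer Science: Graduate Institute of Computer Science and Information Engineering
Advisors: 許永真; 李育杰
Author: 黃喬敬 (Huang, Chiao-Ching)
Dates: 2017-03-03; 2018-07-05
Year: 2015
URL: http://ntur.lib.ntu.edu.tw//handle/246246/275443

Abstract (translated from the Chinese):
This study builds on a supervised dimension reduction method, kernel sliced inverse regression (KSIR), and applies it to multiclass classification and semi-supervised learning. The proposed method suits classification problems with many classes and insufficient training data. Multiclass classification involves more than two classes, and the relations among the classes are usually hard to understand. Depending on the classification strategy, and with class relations left unidentified, common approaches build and train a large number of classifiers and apply them to all observed data. In the first part, we propose a hierarchical classification based on KSIR that decomposes the multiclass problem into a tree-structured set of problems, each with fewer classes, yielding a classification strategy suited to the problem. KSIR projects the data onto the subspace of its effective dimensions, which exposes the relations among classes and provides the information used for decomposition to build the hierarchical tree. In experiments on several public datasets, compared with classic methods, the proposed method greatly reduces the number of classifiers, preserving accuracy and improving efficiency at prediction time. We also apply it in two case studies. The first is multiclass classification for aircraft engine diagnosis, where it classifies the hard-to-distinguish classes better than the other methods. The second is scene recognition on a public dataset, where the method finds similar classes that humans can recognize and achieves better accuracy when training data is scarce.
The second part focuses on adapting KSIR to semi-supervised use. Because labeling data is costly in labor and resources, many real-world problems can be treated as semi-supervised problems. We estimate the statistics that KSIR requires from prior knowledge and use unlabeled data to compute the global statistics, achieving semi-supervised dimension reduction. The method approximately matches the accuracy of existing methods while improving computational efficiency. In addition, label spreading in the reduced space, combined with the proposed hierarchical classification, makes effective use of labeled data in label-scarce problems. This study not only proposes a classification strategy for multiclass problems; the reduced space obtained along the way can also be used by methods such as label spreading to increase the number of labels available for machine learning.

Abstract (English):
In this thesis, we introduce the application of a supervised dimension reduction method, kernel sliced inverse regression (KSIR), to multiclass classification and semi-supervised learning. Our method performs robustly on problems with numerous classes and scarce training data. Multiclass classification involves more than two classes, whose relations are usually complicated to understand and which are difficult to classify. Common approaches train numerous classifiers and apply them to all observations, according to their strategies and without identifying the class relations. We propose a hierarchical classification method, based on KSIR, that decomposes the multiclass problem into a tree-structured problem set. The data projected onto the effective dimension reduction (e.d.r.) subspace constructed by KSIR reveal the relations between classes, and these relations are used to decompose the problem and build the hierarchical tree. Our approach provides a good strategy for multiclass classification and decreases the number of applied classifiers.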
In experiments on public datasets, it achieved accuracy comparable to the classic method. In our case study of flight engine diagnosis, it performed well and succeeded in classifying the classes that are hardest to distinguish. The other case study, scene classification on the public SUN dataset, shows that our hierarchical decomposition strategy performs robustly when classifying numerous classes. The second part applies our method to the semi-supervised domain. Many real-world problems can be formulated as semi-supervised problems, since the acquisition of labeled data often requires domain knowledge. We propose a semi-supervised dimension reduction based on KSIR that uses prior information to estimate the statistical parameters in the KSIR formula. In semi-supervised situations, our method provides not only a good strategy for classification but also a suitable subspace for conditioned label spreading. With our hierarchical strategy and label spreading, classification achieves better accuracy in semi-supervised problems.

File: 9043384 bytes; application/pdf
Thesis release date: 2017/2/15
Thesis usage rights: paid licensing authorized (royalties returned to the school)
Keywords: dimension reduction; hierarchical classification; semi-supervised classification; multiclass classification; classification strategy
Title: 基於降維方法之分層與半監督式多類別分類 (Hierarchical and Semi-supervised Multiclass Classification based on the Dimension Reduction Method)
Type: thesis
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/275443/1/ntu-104-D99922016-1.pdf