Su, Wen-Hsien (蘇玟賢)
Advisor: Chih-Jen Lin (林智仁)
Graduate Institute of Computer Science and Information Engineering, National Taiwan University, 2006
http://ntur.lib.ntu.edu.tw//handle/246246/54120

ABSTRACT

Multi-label classification is an important subject in machine learning, and several approaches are available for handling such problems. In this thesis we focus on using support vector machines (SVMs). Since multi-label classification can be treated as an extension of multi-class classification, it is natural to modify multi-class approaches for multi-label problems. The thesis considers three extensions: “binary,” “label combination,” and the “maximal margin formulation.” We give comprehensive experiments to check their performance; in addition, we give detailed derivations and investigate implementation details. Because “label combination” treats each subset of labels as an individual SVM class, any multi-class method can be directly applied. We discuss several methods of this type: “one-against-one,” the “approach in [45, 46],” and the “method by Crammer and Singer.” We compare and analyze their performance. The last two methods each solve a single optimization problem during training. We find that they perform well when the data set is not large; they are, however, unsuitable for very large problems due to lengthy training time. In such situations, the “label combination” approach with a “one-against-one” multi-class implementation is an effective solution.
Overall, we find that “label combination,” which directly transforms a multi-label problem into a multi-class one, is a practically viable technique.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER

I. Introduction
II. Support Vector Machines
    2.1 Basic Concepts of SVM
    2.2 Decomposition Methods
    2.3 Working Set Selection
III. Methods for Multi-class SVM
    3.1 One-against-the-rest Method
    3.2 One-against-one Method
    3.3 Approach in [45, 46]
    3.4 Method by Crammer and Singer
IV. Methods for Multi-label SVM
    4.1 Binary Method
    4.2 Method of Using All Label Combinations
    4.3 Maximal Margin Formulation
        4.3.1 Derivation
        4.3.2 Implementation
            4.3.2.1 Approximation in Training
            4.3.2.2 Decomposition
            4.3.2.3 Polynomial Time for Prediction
V. Numerical Experiments on Multi-label SVM
    5.1 Evaluation Measures
    5.2 Comparison for MML and “Label Combination”
        5.2.1 Real-World Data Sets
        5.2.2 Experimental Settings
        5.2.3 Results and Discussion
    5.3 Further Comparisons
        5.3.1 Introduction of Data Sets
        5.3.2 Experimental Settings
        5.3.3 Results and Discussion
    5.4 Experiments on Large Data Sets
        5.4.1 Experiment on Full OHSUMED
        5.4.2 Experiment on Full RCV1-V2
        5.4.3 Analysis for RCV1-V2
VI. Conclusions and Discussions
BIBLIOGRAPHY

Title: Support Vector Machines for Multi-label Classification (支持向量機在多標籤分類的研究)
Keywords: support vector machine (支持向量機), multi-label classification (多標籤分類)
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/54120/1/ntu-95-P92922007-1.pdf
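As the abstract notes, the “label combination” approach treats each distinct subset of labels as a single class of an ordinary multi-class problem, so that any multi-class SVM method can be applied unchanged. A minimal sketch of that reduction (the function names and sample labels are illustrative, not code from the thesis):

```python
# Sketch of the "label combination" reduction: every distinct label subset
# observed in the training data becomes one class id of a multi-class
# problem; a multi-class SVM is trained on those ids, and its predictions
# are decoded back into label subsets.

def labelsets_to_classes(label_sets):
    """Map each distinct label subset to an integer class id."""
    mapping = {}        # frozenset of labels -> class id
    class_ids = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in mapping:
            mapping[key] = len(mapping)
        class_ids.append(mapping[key])
    return class_ids, mapping

def classes_to_labelsets(class_ids, mapping):
    """Decode multi-class predictions back into label subsets."""
    inverse = {cid: labels for labels, cid in mapping.items()}
    return [set(inverse[cid]) for cid in class_ids]

# Four training instances with multi-label targets:
y_multilabel = [{"sports"}, {"sports", "news"}, {"news"}, {"sports", "news"}]
y_multiclass, mapping = labelsets_to_classes(y_multilabel)
print(y_multiclass)                                    # [0, 1, 2, 1]
print(classes_to_labelsets(y_multiclass, mapping) == y_multilabel)  # True
```

The resulting multi-class problem can then be handed to any multi-class SVM method; per the abstract, one-against-one is the practical choice for large data sets, while the single-optimization formulations do well when the data set is not large.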