Advisor: 傅楸善 (Chiou-Shann Fuh)
Institution: National Taiwan University, Graduate Institute of Computer Science and Information Engineering
Author: Chen, Hwann-Tzong (陳煥宗)
Date issued: 2006 (repository record dates: 2007-11-26, 2018-07-05)
URI: http://ntur.lib.ntu.edu.tw//handle/246246/53732

Abstract:
This thesis takes a manifold-learning view of dimensionality reduction for applications in computer vision. We examine recent progress in this field, and set out to advance the existing techniques and to broaden the related vision applications, including face and texture recognition, tracking, and content-based image retrieval. Previous studies on reducing the dimensionality of image data often emphasize maintaining the principal image characteristics, or capturing the discriminant image structure when class labels are available; the popular subspace approaches typify efforts along this line. However, such methods do not take into account how the image data are scattered in the high-dimensional space, a property that is useful for solving many vision problems. Manifold learning, in contrast, can uncover the data's underlying manifold structure while reducing the dimensionality. Our approach achieves this effect by re-embedding the manifold structure of the data into a low-dimensional space. Depending on the formulation of a vision problem, a task-related neighborhood of every data point is preserved in the embedding space, so that the nearest-neighbor criterion in the low-dimensional embedding becomes more reliable. In this work we describe three manifold-learning frameworks, each formulated for specific vision applications and each making use of the available information, such as class labels, pairwise similarity relations, or relevance feedback, in deriving the low-dimensional embeddings. The efficiency of our algorithms in learning the manifolds stems from solving generalized eigenvalue problems, whose solutions can be readily computed by existing numerical methods. Among other important issues, we also offer insights and new techniques on how to appropriately represent an image as a point in the high-dimensional space, and how to measure the distance between two points in accordance with the similarity relation entailed by their corresponding images. All these efforts further boost the performance of our manifold-learning methods, whose advantages we illustrate with extensive experimental results.
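As an illustration of the generalized-eigenproblem recipe described in the abstract, the following is a minimal Python/NumPy sketch that learns a linear embedding from labeled data: it builds within-class and between-class neighbor graphs and solves X L_b X^T v = lambda X L_w X^T v for the projection directions. The function names (build_affinity, discriminant_embedding), the 0/1 k-nearest-neighbor weights, and the ridge term are illustrative assumptions for this sketch, not the thesis's exact formulation.

```python
# Minimal sketch: graph-based discriminant embedding via a generalized
# eigenvalue problem (illustrative; not the thesis's exact algorithm).
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist


def build_affinity(X, labels, k=5, same_class=True):
    """k-nearest-neighbor affinity restricted to points of the same class
    (same_class=True) or of different classes (same_class=False).
    Simple 0/1 weights; a heat kernel exp(-d^2 / (2*sigma^2)) is a common
    alternative."""
    n = X.shape[0]
    D = cdist(X, X)                        # pairwise Euclidean distances
    W = np.zeros((n, n))
    for i in range(n):
        mask = (labels == labels[i]) if same_class else (labels != labels[i])
        mask[i] = False                    # exclude the point itself
        candidates = np.where(mask)[0]
        if candidates.size == 0:
            continue
        nn = candidates[np.argsort(D[i, candidates])[:k]]
        W[i, nn] = 1.0
    return np.maximum(W, W.T)              # symmetrize


def discriminant_embedding(X, labels, dim=2, k=5, ridge=1e-6):
    """Learn a linear map V (d x dim) that keeps same-class neighbors close
    while pushing different-class neighbors apart, by solving the
    generalized eigenproblem  X^T L_b X v = lambda * X^T L_w X v."""
    Xc = X - X.mean(axis=0)                # center the data; rows are samples
    W_w = build_affinity(Xc, labels, k, same_class=True)
    W_b = build_affinity(Xc, labels, k, same_class=False)
    L_w = np.diag(W_w.sum(axis=1)) - W_w   # within-class graph Laplacian
    L_b = np.diag(W_b.sum(axis=1)) - W_b   # between-class graph Laplacian
    A = Xc.T @ L_b @ Xc
    B = Xc.T @ L_w @ Xc + ridge * np.eye(X.shape[1])  # ridge keeps B positive definite
    evals, evecs = eigh(A, B)              # generalized symmetric eigenproblem
    V = evecs[:, np.argsort(evals)[::-1][:dim]]       # largest eigenvalues
    return V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 20)),
                   rng.normal(1.5, 1.0, (50, 20))])
    y = np.array([0] * 50 + [1] * 50)
    V = discriminant_embedding(X, y, dim=2)
    Z = X @ V                              # low-dimensional embedding
    print(Z.shape)                         # (100, 2)
```

The columns of V define the linear map into the low-dimensional space, where a nearest-neighbor rule can then be applied for recognition or retrieval.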
Table of Contents:
1 Introduction
  1.1 Manifold Learning
    1.1.1 Isometric Feature Mapping
    1.1.2 Locally Linear Embedding
  1.2 Overview of the Thesis
2 Learning with Labels
  2.1 Motivation
    2.1.1 Dimensionality Reduction
    2.1.2 Discriminant Analysis
    2.1.3 Kernel Methods and Matrix-Based Representations
    2.1.4 Manifold Learning for Classification
  2.2 Local Discriminant Embedding
    2.2.1 LDE Algorithm
    2.2.2 Justifications for LDE
  2.3 Generalizations for LDE
    2.3.1 Two-Dimensional LDE
    2.3.2 Kernel LDE
    2.3.3 LDE versus LDA
  2.4 Face Recognition and LDE
    2.4.1 The Datasets
    2.4.2 Experiments
  2.5 Discussion
3 Learning with Few Examples
  3.1 Motivation
    3.1.1 Learning Distance Metrics
    3.1.2 A Connection to Feature Selection
    3.1.3 A New Approach
  3.2 Glocal Image Representations
  3.3 Learning Image Metrics
  3.4 Algorithm: Bilinear-Glocal (BiGL) Image Metrics
  3.5 Experiments
    3.5.1 Nearest Neighbor Classifications
    3.5.2 Multi-Object Tracking
  3.6 Discussion
4 Learning with User Semantics
  4.1 Motivation
    4.1.1 Global and Local Image Features
    4.1.2 Relevance Feedback
    4.1.3 Manifold Ways
    4.1.4 Semantic Manifolds for Relevance Feedback
  4.2 Semantic Manifold Learning
    4.2.1 Augmented Relation Embedding
    4.2.2 Kernel ARE
  4.3 Image Representations for Image Retrieval
    4.3.1 Global Features for CBIR
    4.3.2 Local Features for CBIR
    4.3.3 Earth Mover's Distance
    4.3.4 Augmented Features for CBIR
  4.4 Experiments
    4.4.1 The Image Dataset
    4.4.2 Evaluation and Implementation Settings
    4.4.3 Image Features for ARE
    4.4.4 Manifold-Learning Schemes
    4.4.5 Embedding Dimensions
    4.4.6 Visualization of Semantics
  4.5 Discussion
5 Conclusions
  5.0.1 Future Directions
Bibliography

Format: application/pdf, 3,069,278 bytes
Language: en-US
Subjects: Dimensionality Reduction (降維方法); Computer Vision (電腦視覺); Manifold Learning
Title (Chinese): 降維方法在電腦視覺之應用
Title (English): Dimensionality Reduction for Vision Applications: A Manifold-Learning Approach
Type: thesis
Fulltext: http://ntur.lib.ntu.edu.tw/bitstream/246246/53732/1/ntu-95-D88526014-1.pdf