Video-based Clothing Retrieval (基於影片資訊之衣服檢索系統)

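Advisor: 歐陽明
Institution: National Taiwan University, Graduate Institute of Networking and Multimedia
Author: 王詩涵 (Wang, Shih-Han)
Dates: 2014-11-29; 2018-07-05
Year: 2013
URL: http://ntur.lib.ntu.edu.tw//handle/246246/263455

Abstract (translated from the Chinese)

As consumer habits change, demand for clothing retrieval is rising quickly on many well-known online clothing stores. Unlike ordinary keyword search, image-based search offers a more intuitive and more engaging clothing recommendation system, and can even support applications such as identity or occupation recognition. Image-based search has also become one of the main topics in recent clothing retrieval research. In this thesis, we propose a new kind of clothing recommendation interface: a video-based clothing retrieval system. A user can pause a video clip when a character is wearing clothing they like, and the system automatically finds similar styles on online websites. Such a system still faces many research problems, such as human pose estimation and real-time performance. This work addresses two of them: correcting inaccurate human pose estimates, and finding similar clothing in large collections of online data with complex backgrounds. First, we propose a pose estimation mechanism that draws on a small number of preceding video segments to correct inaccurate pose estimates. Then, given a correct pose, we use an image segmentation algorithm to design a fully automatic foreground segmentation mechanism that handles the variety of backgrounds in large datasets. We evaluate each mechanism on several video clips and on data collected from various online shopping websites. The experimental results show that video information improves human pose estimation and that fully automatic foreground segmentation resolves the complex background problem.

Abstract

Clothing retrieval is in growing demand on online clothing shopping websites. Beyond keyword-based clothing search, image-based clothing retrieval has attracted interest in recent research. It enables more engaging clothing recommendation systems and may improve identity or occupation recognition. In this thesis, we present a video-based clothing retrieval system. We believe the system offers another intuitive clothing recommendation interface in a smart home, with the following application scenario: using a TV remote control, a viewer selects a shot in which a character is wearing appealing clothing and learns the clothing style from that character. However, major challenges remain, such as human pose estimation and the complex backgrounds found in online shopping datasets, which often cause inaccurate retrieval results. Our research focuses on these two issues. First, we propose a human pose estimation mechanism that uses a video clip of frames to refine inaccurate human poses. Second, we explore an automatic foreground segmentation method based on the "GrabCut" algorithm to tackle the complex background problem. In our experiments, we collect a few video clips and several kinds of online shopping datasets.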
The experimental results demonstrate that our mechanisms improve inaccurate pose estimation and can tackle the complex background problem.

Contents

Acknowledgements (誌謝)
Chinese Abstract (中文摘要)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
  2.1 Clothing retrieval
  2.2 2D human pose estimation
3 Framework
4 Temporal consistent pose estimation
  4.1 Temporal consistent pose estimation
5 Clothing retrieval
  5.1 Automatic foreground segmentation via Grabcut
  5.2 Foreground spatial statistic
  5.3 Features
    5.3.1 Color Moment
    5.3.2 Color Histogram
    5.3.3 Skin Descriptor
    5.3.4 Local Binary Pattern
    5.3.5 Histogram of Gradient
6 Experiment
  6.1 Datasets Construction
    6.1.1 Video Clips Collection
    6.1.2 Clothing Image Collection
    6.1.3 Clothing Attribute Labeling
  6.2 Experimental result
    6.2.1 Improved Pose Estimation
    6.2.2 Evaluation Criterion
    6.2.3 Performance of Automatic foreground segmentation and Foreground statistics
    6.2.4 Performance of Different Features On Different Attribute
    6.2.5 Video-based vs. Image-based
7 Conclusion
Bibliography

File size: 4806239 bytes
Format: application/pdf
Thesis release date: 2014/01/27
Usage permission: paid licensing authorized (royalties returned to the author)
Keywords: foreground segmentation; human pose estimation; video-based; clothing retrieval
English title: Video-based Clothing Retrieval
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/263455/1/ntu-102-R00944038-1.pdf