Institution: National Taiwan University, Graduate Institute of Electrical Engineering
Contributors: 陳銘憲; 蕭人豪 (Hsiao, Jen-Hao)
Record dates: 2013-03-27; 2018-07-06
Year: 2010
URI: http://ntur.lib.ntu.edu.tw//handle/246246/254112

Abstract:

The development of devices such as digital cameras and mobile phones equipped with digital imaging sensors has generated a huge amount of multimedia data, including images and videos. With the worldwide spread of the Internet, the amount of visual information that an ordinary person can easily access has become vast. Efficient access to and retrieval of visual information has thus become a very active research topic in the multimedia community.

The main goal of this dissertation is to enable visual search of images in a large image collection. Two types of visual information search, near-duplicate image detection and image object retrieval, are explored for different application fields.
In addition to the fundamental search issues, we also study the problem of ranking refinement, whose goal is to improve an existing ranking function using a set of labeled or pseudo-relevant instances. We are particularly interested in learning a better query model from two complementary sources of information: the base ranker (i.e., the existing ranking function) and user feedback.

In this dissertation, we first present a new framework called the extended feature set (EFS) for detecting copies of images. Instead of dealing directly with the feature selection problem, which is hard to solve and domain dependent, the proposed EFS framework addresses the copy detection problem by using prior simulated attacks. This technique improves detection accuracy by automatically generating features with the invariance needed to resist various types of image manipulation. Furthermore, the proposed approach can be integrated into existing copy detectors to further improve their performance.

We then present a novel language-model-based approach with pseudo-relevance feedback to address the vocabulary problem in visual bag-of-words-based (VBOW-based) search, a leading method for image object retrieval. We employ the pseudo-positive images retrieved in response to the original query as a set of “cues” to gradually refine the query language model. Unlike traditional approaches that crudely append feedback information to the original query, the proposed approach reconstructs the query language model at a finer granularity so that the query concepts can be captured more accurately.

Finally, we describe Intention-Focused Active Reranking, an approach for automatically finding the right information in the user’s labeled data to re-estimate the query model under the active feedback framework.
Three novel strategies are proposed to boost the performance of the base ranker (i.e., a given ranking function): (1) an active selection criterion, which selects for user labeling a small number of feedback images that are the most informative to the base ranker; (2) user intention verification, which captures the user’s intention at the object level to alleviate the query drift problem; and (3) discriminative query model re-estimation, which augments the generative approach with a model of the discriminative information conveyed by positive and negative feedback.

The proposed approaches are experimentally evaluated on real-world image data sets. Experimental results demonstrate that the proposed EFS approach can substantially enhance the accuracy of copy detection, and that the proposed ranking refinement algorithms bring significant improvement in image object retrieval accuracy over a non-feedback baseline and outperform conventional feedback approaches.

Size: 1718071 bytes
Format: application/pdf
Language: en-US
Keywords: copy detection; image object retrieval; relevance feedback
Title: 視覺資訊搜尋與排序優化技術之研究 (A Study on the Visual Information Search and Ranking Refinement)
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/254112/1/ntu-99-D93921019-1.pdf
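
Illustrative note: the extended feature set (EFS) idea summarized in the abstract, using prior simulated attacks so that stored features already cover likely manipulations, might be sketched as follows. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the block-mean descriptor, the attack list, the distance threshold, and every function name below are hypothetical choices made only for illustration.

```python
import numpy as np


def block_mean_feature(img, grid=4):
    """Toy descriptor (an assumption): mean intensity of each cell in a grid x grid layout."""
    h, w = img.shape
    cropped = img[: h - h % grid, : w - w % grid]
    cells = cropped.reshape(grid, cropped.shape[0] // grid, grid, cropped.shape[1] // grid)
    return cells.mean(axis=(1, 3)).ravel()


def simulated_attacks(img):
    """Hypothetical attack list: identity, flips, 180-degree rotation, brightness gain."""
    return [
        img,
        np.fliplr(img),
        np.flipud(img),
        np.rot90(img, 2),
        np.clip(img * 1.2, 0.0, 1.0),
    ]


def build_extended_feature_set(reference):
    """Extract one feature vector per simulated attack of the reference image."""
    return np.stack([block_mean_feature(a) for a in simulated_attacks(reference)])


def min_distance(query, feature_set):
    """Smallest Euclidean distance from the query feature to any stored feature."""
    q = block_mean_feature(query)
    return float(np.linalg.norm(feature_set - q, axis=1).min())


def is_copy(query, feature_set, threshold=0.1):
    """Flag the query as a copy if it is close to any entry of the feature set."""
    return min_distance(query, feature_set) <= threshold


# A horizontal gradient stands in for a registered reference image.
reference = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
efs = build_extended_feature_set(reference)

# A mirrored, slightly brightened copy, and an unrelated (vertical gradient) image.
attacked_query = np.fliplr(reference) + 0.01
unrelated_query = reference.T.copy()
```

In this toy setting the mirrored copy matches the stored feature of the simulated flip, whereas comparing it against the feature of the unmodified reference alone gives a much larger distance. Wrapping an existing detector's feature extractor in this way is one reading of how such a scheme could be integrated into existing copy detectors.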