https://scholars.lib.ntu.edu.tw/handle/123456789/581320
Title: Cross-Batch Reference Learning for Deep Retrieval
Authors: Yang H.-F.; Lin K.; Chen T.-Y.; Chu-Song Chen
Keywords: Deep learning; Gradient methods; Image retrieval; Optimization; Semantics; Stochastic systems; Different domains; Non-differentiable; Performance measure; Retrieval applications; Retrieval process; Semantic content; Stochastic gradient descent; Visual recognition; Classification (of information); article; deep learning; human; human experiment; image retrieval; stochastic model; visual memory
Date Published: 2020
Volume: 31
Issue: 9
Pages: 3145-3158
Journal: IEEE Transactions on Neural Networks and Learning Systems
Abstract: Learning effective representations that exhibit semantic content is crucial to image retrieval applications. Recent advances in deep learning have made significant improvements in performance on a number of visual recognition tasks. Studies have also revealed that visual features extracted from a deep network trained on a large-scale image data set (e.g., ImageNet) for classification are generic and perform well on new recognition tasks in different domains. Nevertheless, when applied to image retrieval, such deep representations do not attain performance as impressive as they do for classification. This is mainly because the deep features are optimized for classification rather than for the desired retrieval task. We introduce the cross-batch reference (CBR), a novel training mechanism that enables the optimization of deep networks with a retrieval criterion. With the CBR, the networks leverage both the samples in a single minibatch and the samples in the others for weight updates, enhancing stochastic gradient descent (SGD) training by enabling interbatch information passing. This interbatch communication is implemented as a cross-batch retrieval process in which the networks are trained to maximize the mean average precision (mAP), a popular performance measure in retrieval. Maximizing the cross-batch mAP is equivalent to pulling mutually relevant samples together in the feature space while separating mutually irrelevant ones. The learned features can discriminate between relevant and irrelevant samples and thus are suitable for retrieval. To circumvent the discrete, nondifferentiable mAP maximization, we derive an approximate, differentiable lower bound that can be easily optimized in deep networks. Furthermore, the mAP loss can be used alone or with a classification loss. Experiments on several data sets demonstrate that our CBR learning provides favorable performance, validating its effectiveness. © 2012 IEEE.
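The abstract does not spell out the paper's differentiable lower bound on mAP. As an illustration only, the following is a minimal pure-Python sketch of one standard way to make AP smooth, replacing the discrete rank indicator 1[s_j > s_i] with a sigmoid of the score difference; the temperature `tau` and the function name are assumptions of this sketch, not the paper's actual formulation.

```python
import math

def smooth_ap(scores, labels, tau=0.01):
    """Differentiable surrogate for average precision (AP).

    The discrete rank indicator 1[s_j > s_i] is replaced by a sigmoid
    of the score difference, so the soft AP below is smooth in the
    scores and could be maximized by gradient descent (here we only
    evaluate it).

    scores: similarity of each database item to the query.
    labels: 1 for relevant items, 0 for irrelevant.
    tau:    temperature; smaller values approach the exact discrete AP.
    """
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    n = len(scores)
    total, n_pos = 0.0, sum(labels)
    for i in range(n):
        if not labels[i]:
            continue
        # Soft rank of relevant item i among all items ...
        rank_all = 1.0 + sum(sigmoid((scores[j] - scores[i]) / tau)
                             for j in range(n) if j != i)
        # ... and among the relevant items only.
        rank_pos = 1.0 + sum(sigmoid((scores[j] - scores[i]) / tau)
                             for j in range(n) if j != i and labels[j])
        total += rank_pos / rank_all  # soft precision at item i's rank
    return total / n_pos
```

With a small `tau`, a ranking that places all relevant items first yields a soft AP near 1, while a ranking led by an irrelevant item scores markedly lower, which is the behavior a retrieval loss built on such a surrogate would reward.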
URI: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85090250917&doi=10.1109%2fTNNLS.2019.2936876&partnerID=40&md5=07459f62bcfef6f440868993f3eafda4 ; https://scholars.lib.ntu.edu.tw/handle/123456789/581320
ISSN: 2162-237X
DOI: 10.1109/TNNLS.2019.2936876
Appears in Collections: Department of Computer Science and Information Engineering
Items in this IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.