https://scholars.lib.ntu.edu.tw/handle/123456789/632334
Title: | Weakly-supervised video re-localization with multiscale attention model | Authors: | Huang Y.-H.; Hsu K.-J.; SHYH-KANG JENG; Lin Y.-Y. |
Publication date: | 2020 | Pages: | 11077-11084 | Source publication: | AAAI 2020 - 34th AAAI Conference on Artificial Intelligence | Abstract: | Video re-localization aims to localize a sub-sequence, called the target segment, in an untrimmed reference video that is similar to a given query video. In this work, we propose an attention-based model to accomplish this task in a weakly supervised setting. Namely, we derive our CNN-based model without using the annotated locations of the target segments in reference videos. Our model contains three modules. First, it employs a pre-trained C3D network for feature extraction. Second, we design an attention mechanism to extract multiscale temporal features, which are then used to estimate the similarity between the query video and a reference video. Third, a localization layer detects where the target segment is in the reference video by determining whether each frame in the reference video is consistent with the query video. The resultant CNN model is derived based on the proposed coattention loss, which discriminatively separates the target segment from the reference video. This loss maximizes the similarity between the query video and the target segment while minimizing the similarity between the target segment and the rest of the reference video. Our model can be modified for fully supervised re-localization. Our method is evaluated on a public dataset and achieves state-of-the-art performance under both weakly supervised and fully supervised settings. Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. |
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85106420780&partnerID=40&md5=426267e089432e46134002f9c667b948 https://scholars.lib.ntu.edu.tw/handle/123456789/632334 |
SDG/Keywords: | Feature extraction; Attention mechanisms; Attention model; CNN models; Public dataset; Query video; Re-localization; State-of-the-art performance; Temporal features; Artificial intelligence |
Appears in collections: | Department of Electrical Engineering |
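The abstract describes a loss that pulls the query toward the target segment while pushing the target segment away from the rest of the reference video, followed by a per-frame localization step. The following is only a minimal NumPy sketch of that idea, not the authors' implementation. The function names `coattention_loss` and `localize`, the mean-pooled query descriptor, cosine similarity, soft per-frame scores in [0, 1], and the 0.5 threshold are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two 1-D vectors (eps avoids division by zero)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def coattention_loss(query_feats, ref_feats, frame_scores, eps=1e-8):
    """Illustrative loss, assumed form (not taken from the paper).

    query_feats:  (m, d) per-frame features of the query video
    ref_feats:    (n, d) per-frame features of the reference video
    frame_scores: (n,) values in [0, 1], higher = more likely in the target segment
    """
    # Assumption: the query is summarized by mean pooling over time.
    q = query_feats.mean(axis=0)
    w_in = frame_scores
    w_out = 1.0 - frame_scores
    # Score-weighted averages stand in for "target segment" and "the rest".
    target = (w_in[:, None] * ref_feats).sum(axis=0) / (w_in.sum() + eps)
    rest = (w_out[:, None] * ref_feats).sum(axis=0) / (w_out.sum() + eps)
    # Maximize sim(query, target); minimize sim(target, rest).
    return -cosine(q, target) + cosine(target, rest)


def localize(frame_scores, threshold=0.5):
    """Return (start, end), inclusive, of the longest run of frames whose
    score is >= threshold, or None if no frame passes the threshold."""
    best = None
    start = None
    # A sentinel below any threshold closes a run that reaches the last frame.
    for i, s in enumerate(list(frame_scores) + [float("-inf")]):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


# Toy example: reference frames 3..5 match the query, the others do not.
a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
ref = np.stack([a, a, a, b, b, b, a, a, a])
query = np.stack([b, b])
good = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=float)
bad = 1.0 - good
loss_good = coattention_loss(query, ref, good)
loss_bad = coattention_loss(query, ref, bad)
segment = localize(good)
```

In this toy setup the scores that highlight the matching frames yield a lower loss than the inverted scores, which is the behavior the abstract attributes to the coattention loss. In a weakly supervised setting the per-frame scores would be produced by the model rather than given, since no target-segment annotations are used.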