Title: Image recall on image-text intertwined lifelogs
Authors: Chu, Tzu Hsuan; Huang, Hen Hsen; Chen, Hsin Hsi
Keywords: Image retrieval; Lifelogging; Multimodal representation
Issue Date: 14-Oct-2019
Source: Proceedings - 2019 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2019
Abstract:
© 2019 Association for Computing Machinery. People engage in lifelogging by taking photos with cameras and cellphones anytime and anywhere, and share these photos, intertwined with captions or descriptions, on social media platforms. Such image-text intertwined data provides richer information for image recall: when images alone cannot capture the complete experience, the accompanying text complements them by describing the life events behind the photos. This work proposes a multimodal retrieval model for image recall in image-text intertwined lifelogs. Our Attentive Image-Story model combines an Image model, which maps visual and textual information into a single representation space, with a Story model, which captures text-based contextual information, using an attention mechanism to reduce the semantic gap between visual and textual information. Experimental results show that our model outperforms a state-of-the-art image-based retrieval model and an image/text hybrid system.
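The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it names: using an attention mechanism to combine an image embedding with story (text) embeddings in a shared representation space. The function name, the scaled dot-product scoring, and the fusion by concatenation are all assumptions for illustration, not the authors' actual method.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_fusion(image_vec, story_vecs):
    """Hypothetical attention step: the image embedding acts as a query
    over sentence embeddings of the story, and the attended context is
    concatenated with the image representation.

    image_vec:  (d,)   image embedding in the shared space
    story_vecs: (n, d) one embedding per story sentence, same space
    returns:    (2d,)  joint image-story representation
    """
    d = image_vec.shape[-1]
    scores = story_vecs @ image_vec / np.sqrt(d)   # (n,) similarity scores
    weights = softmax(scores)                      # attention weights over sentences
    context = weights @ story_vecs                 # (d,) weighted story context
    return np.concatenate([image_vec, context])    # fused representation

# Toy usage: one image embedding attended over three story sentences.
fused = attentive_fusion(np.ones(4), np.ones((3, 4)))
```

At retrieval time, such a fused vector could be scored against a text query embedding by cosine similarity, which is one common way such hybrid representations are used; the paper itself may differ.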
Appears in Collections: Department of Computer Science and Information Engineering (資訊工程學系)