Me-link: Link me to the media - Fusing audio and visual cues for robust and efficient mobile media interaction

Authors: Yeh C.-Y.; Hsu Y.-M.; Huang H.; Jheng H.-W.; Su Y.-C.; Chiu T.-H.; Winston Hsu
Record dates: 2019-07-10; 2019-07-10
Year: 2014
ISBN: 9781450327459
URL: https://scholars.lib.ntu.edu.tw/handle/123456789/412999
Type: conference paper
DOI: 10.1145/2567948.2577018
Scopus ID: 2-s2.0-84990908128
Keywords: Augmented reality; Mobile video recognition; Second screen

Abstract: In this demo, we present a scalable mobile video recognition system, named "Me-link," based on progressive fusion of lightweight audio-visual features. With our system, users only have to point the mobile camera at the video they are interested in. The system captures the frames and sounds, then retrieves relevant information immediately. As users hold the mobile device on the video longer, the system progressively aggregates the cues over time and returns more accurate results. We also consider noisy real-world environments, where users may not get clear visual or audio signals. In the aggregation step for audio and visual cues, our system automatically detects which channels are available for the final ranking. On the server side, users can upload videos with accompanying information via a website. In addition, we link streaming signals so that users can access real-time broadcasts through "Me-link".

© Copyright 2014 by the International World Wide Web Conferences Steering Committee.
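
Illustrative sketch: the abstract describes progressively aggregating audio and visual cues over time and using only the channels that are actually available for the final ranking. The record does not give the authors' actual features, scores, or fusion rule, so the following Python is only a minimal sketch under stated assumptions: each channel is assumed to return a dictionary of per-video match scores at every time step, a channel is assumed "unavailable" when it is missing or its best score falls below a threshold, and fusion is assumed to be a simple running sum. The names `fuse_step`, `rank`, and `min_confidence` are hypothetical.

```python
from collections import defaultdict


def fuse_step(accumulated, visual_scores, audio_scores, min_confidence=0.2):
    """Add one time step of per-channel match scores to the running totals.

    Assumption: a channel is skipped when it returns nothing or when its
    best score is below `min_confidence` (e.g. a noisy room or a blurred frame).
    """
    for scores in (visual_scores, audio_scores):
        if not scores or max(scores.values()) < min_confidence:
            continue  # channel treated as unavailable for this step
        for video_id, score in scores.items():
            accumulated[video_id] += score
    return accumulated


def rank(accumulated):
    """Return candidate video ids, best accumulated score first."""
    return sorted(accumulated, key=accumulated.get, reverse=True)


# Two time steps; the audio channel is too weak in the first one.
totals = defaultdict(float)
fuse_step(totals, {"clip_a": 0.6, "clip_b": 0.5}, {"clip_b": 0.1})
fuse_step(totals, {"clip_a": 0.7}, {"clip_a": 0.4, "clip_b": 0.3})
print(rank(totals))
```

Under these assumptions, holding the device on the video longer simply means calling `fuse_step` more times, so evidence for the correct video accumulates while a temporarily unusable channel contributes nothing.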