The acousticvisual emotion Gaussians model for automatic generation of music video
Journal
MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia
ISBN
9781450310895
Date Issued
2012-12-26
Author(s)
Abstract
This paper presents a novel content-based system that utilizes the perceived emotion of multimedia content as a bridge to connect music and video. Specifically, we propose a novel machine learning framework, called Acousticvisual Emotion Gaussians (AVEG), to jointly learn the tripartite relationship among music, video, and emotion from an emotion-annotated corpus of music videos. For a music piece (or a video sequence), the AVEG model predicts its emotion distribution in a stochastic emotion space from the corresponding low-level acoustic (resp. visual) features. Finally, music and video are matched by measuring the similarity between the two corresponding emotion distributions, based on a distance measure such as KL divergence. © 2012 Authors.
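The abstract matches a music piece to a video by comparing their predicted emotion distributions with KL divergence. As a minimal illustration only, assuming each predicted distribution is a single multivariate Gaussian over a hypothetical 2-D valence-arousal emotion space (the names `gaussian_kl`, `music_mu`, etc. are illustrative, not from the paper), the closed-form KL divergence between two Gaussians can be sketched as:

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence D( N(mu0, cov0) || N(mu1, cov1) )."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(cov1_inv @ cov0)          # trace term
        + diff @ cov1_inv @ diff           # Mahalanobis term
        - k                                # dimensionality
        + np.log(np.linalg.det(cov1) / np.linalg.det(cov0))  # log-det ratio
    )

# Hypothetical predicted emotion distributions (valence, arousal):
music_mu = np.array([0.6, 0.3])
music_cov = np.array([[0.05, 0.01], [0.01, 0.04]])
video_mu = np.array([0.5, 0.2])
video_cov = np.array([[0.06, 0.00], [0.00, 0.05]])

# A lower divergence indicates a closer emotional match.
score = gaussian_kl(music_mu, music_cov, video_mu, video_cov)
```

In a retrieval setting one would compute this score between a query music piece and every candidate video (or vice versa) and rank candidates by ascending divergence. Note that KL divergence is asymmetric, so the direction of comparison is a design choice.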
Subjects
cross-modal media retrieval | emotion recognition
Type
conference paper
