Bilingual analysis of song lyrics and audio words
Journal
MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia
ISBN
9781450310895
Date Issued
2012-12-26
Author(s)
Abstract
Thanks to the development of music audio analysis, state-of-the-art techniques can now detect musical attributes such as timbre, rhythm, and pitch with a certain level of reliability and effectiveness. An emerging body of research has begun to model the high-level perceptual properties of music listening, including the mood and the preferred listening context of a music piece. Towards this goal, we propose a novel text-like feature representation that encodes the rich and time-varying information of music using a composite of features extracted from the song lyrics and audio signals. In particular, we investigate dictionary learning algorithms to optimize the generation of local feature descriptors, as well as probabilistic topic models to group semantically relevant text and audio words. This text-like representation leads to a significant improvement in automatic mood classification over conventional audio features. © 2012 ACM.
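The idea of a text-like representation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny, hand-written codebook (hypothetical values) and uses hard nearest-codeword assignment as a simple stand-in for the dictionary learning and sparse coding named in the abstract. The names `CODEBOOK`, `audio_words`, and `bilingual_document` are illustrative only, and the topic-model step (for example LDA) is omitted.

```python
from collections import Counter

# Hypothetical 2-D codebook; in practice it would be learned from audio frames.
CODEBOOK = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
    )


def audio_words(frames, codebook):
    """Map each frame-level feature vector to a discrete token such as 'aw_1'."""
    return [f"aw_{nearest_codeword(frame, codebook)}" for frame in frames]


def bilingual_document(lyrics, frames, codebook):
    """Build one bag of words that mixes lyric tokens and audio-word tokens."""
    text_words = lyrics.lower().split()
    return Counter(text_words + audio_words(frames, codebook))


# Toy input: three lyric tokens and four made-up frame-level feature vectors.
frames = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.9), (1.0, 0.8)]
doc = bilingual_document("Blue skies blue", frames, CODEBOOK)
```

A count vector of this kind, one per song, is the sort of input a probabilistic topic model could consume, so that lyric words and audio words that co-occur end up grouped in the same topics.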
Subjects
audioword | LDA | MLLDA | sparse coding | topic model
Type
conference paper
