Towards real-time music auto-tagging using sparse features
Journal
Proceedings - IEEE International Conference on Multimedia and Expo
ISBN
9781479900152
Date Issued
2013-10-21
Author(s)
Abstract
Unsupervised feature learning algorithms such as sparse coding and deep belief networks have been shown to be a viable alternative to hand-crafted feature design for music information retrieval. Nevertheless, such algorithms are usually computationally expensive. This paper investigates techniques to accelerate sparse feature extraction and music classification. To study the trade-off between computational efficiency and accuracy, we compare state-of-the-art dense audio features with sparse features computed using 1) sparse coding with a random dictionary, 2) a randomized clustering forest, and 3) an extension of the randomized clustering forest to temporal signals. For classifier training and prediction, we compare support vector machines with linear and non-linear kernel functions. We evaluate music auto-tagging over 140 genre/style tags on a subset of 7,799 songs from the CAL10k data set. Our results show an 11-fold speed increase with a 3.45% accuracy loss compared to dense features. With the proposed sparse features, feature extraction and auto-tagging can be completed within 1 second per song, with a tagging accuracy of 0.1302 in mean average precision. © 2013 IEEE.
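To illustrate the general idea behind the fastest of the compared pipelines, the sketch below encodes dense feature vectors against a random (unlearned) dictionary, sparsifies the projections by soft thresholding, and trains a linear SVM for a single tag, scoring it with average precision. This is a minimal toy example on synthetic data, not the authors' implementation; the feature dimensions, dictionary size, threshold, and SVM settings are assumptions chosen only for illustration.

```python
# Illustrative sketch (not the paper's code): sparse features from a random
# dictionary via soft thresholding, then a linear SVM for one binary tag.
# All sizes and parameters below are assumed toy values, not from the paper.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)

n_songs, n_dims, n_atoms = 200, 128, 512        # toy sizes (assumptions)
X = rng.standard_normal((n_songs, n_dims))      # stand-in for dense audio features
y = rng.integers(0, 2, n_songs)                 # stand-in for one binary tag label

# Random dictionary: atoms drawn at random and L2-normalised; no learning step,
# which is what removes the expensive dictionary-training phase.
D = rng.standard_normal((n_dims, n_atoms))
D /= np.linalg.norm(D, axis=0, keepdims=True)

def encode(X, D, lam=0.5):
    # Project onto the dictionary and soft-threshold: small responses are zeroed,
    # leaving a sparse, higher-dimensional code for each song.
    Z = X @ D
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

Z = encode(X, D)

# Linear SVM: training and prediction cost grow only with the code dimension,
# which keeps per-song tagging fast compared to non-linear kernels.
clf = LinearSVC(C=1.0).fit(Z, y)
scores = clf.decision_function(Z)
print("per-tag average precision:", average_precision_score(y, scores))
```

In an auto-tagging setting, one such binary classifier would be trained per tag and the per-tag average precisions averaged to obtain mean average precision, the metric reported in the abstract.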
Subjects
music auto-tagging | randomized clustering forest | sparse coding | unsupervised feature learning
Type
conference paper