Modified lasso screening for audio word-based music classification using large-scale dictionary
Journal
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISBN
9781479928927
Date Issued
2014-01-01
Author(s)
Abstract
Representing music information using audio codewords has led to state-of-the-art performance on various music classification benchmarks. Compared with conventional audio descriptors, audio words offer greater flexibility in capturing the nuances of music signals: each codeword can be viewed as a quantization of the music universe, and the quantization becomes finer as the size of the dictionary (i.e., the audio codebook) increases. In practice, however, the high computational cost of codeword assignment may discourage the use of a large dictionary. This paper presents two modifications of a LASSO screening technique developed in the compressive sensing field to speed up the codeword assignment process. The first modification exploits the repetitive nature of music signals, whereas the second relaxes a screening constraint that is specific to reconstruction and not needed for classification. Our experiments show that the proposed method enables the use of a dictionary of 10,000 codewords with a runtime close to that of a dictionary of 1,000 codewords. Moreover, using the larger dictionary significantly improves the mean average precision (MAP) from 0.219 to 0.246 for tagging thousands of tracks with 147 possible genre tags. © 2014 IEEE.
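To illustrate the general idea of LASSO screening mentioned in the abstract, the following is a minimal sketch of the basic SAFE screening test, assuming unit-norm codewords and the objective 0.5*||x - D a||^2 + lam*||a||_1. It is not the modified screening proposed in the paper, and the function name `safe_screen` and its parameters are hypothetical. The point is only that codewords guaranteed to receive a zero coefficient can be dropped before the LASSO solver runs.

```python
import math


def safe_screen(x, dictionary, lam):
    """Return indices of codewords that survive the basic SAFE test.

    x          : signal frame, list of floats (hypothetical input)
    dictionary : list of unit-norm codewords, each a list of floats
    lam        : LASSO regularization weight, lam > 0
    """
    # Absolute correlation of each codeword with the signal.
    corr = [abs(sum(d_i * x_i for d_i, x_i in zip(d, x))) for d in dictionary]
    lam_max = max(corr)

    # For lam >= lam_max the LASSO solution is all zeros: nothing survives.
    if lam >= lam_max:
        return []

    x_norm = math.sqrt(sum(v * v for v in x))
    # SAFE test: a unit-norm codeword whose correlation is below this
    # threshold is guaranteed a zero coefficient and can be discarded.
    threshold = lam - x_norm * (lam_max - lam) / lam_max
    return [j for j, c in enumerate(corr) if c >= threshold]


if __name__ == "__main__":
    D = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
    print(safe_screen([1.0, 0.0], D, 0.9))
```

Only the surviving codewords would then be passed to the solver, so the cost of codeword assignment depends on the reduced dictionary rather than the full one.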
Subjects
feature learning | genre classification | LASSO screening | music information retrieval | Sparse coding
Type
conference paper
