Power-scaled spectral flux and peak-valley group-delay methods for robust musical onset detection
Journal
Proceedings - 40th International Computer Music Conference, ICMC 2014 and 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy: From Digital Echos to Virtual Ethos
ISBN
9789604661374
Date Issued
2014-01-01
Author(s)
Su, Li
Abstract
A robust onset detection method has to deal with the wide dynamic ranges and diverse transient behaviors prevalent in real-world music signals. This paper proposes two novel onset detection methods. The first, termed power-scaled spectral flux (PSSF), applies power scaling to the spectral flux to better balance the wide dynamic range in the spectrogram. The second, called peak-valley group-delay (PVGD), improves robustness to noise by detecting peak-valley pairs in the summed group-delay function to capture the attack-decay envelope. The proposed methods are evaluated on a piano dataset and on a diverse dataset of 12 different Western and Turkish instruments. To tackle the problem from a fundamental signal processing perspective, this study does not consider advanced methods such as late fusion, multi-band processing, and neural networks. Experimental results show that the proposed methods yield competitive accuracy on the two datasets, improving the F-score from 0.956 to 0.963 on the former and from 0.712 to 0.754 on the latter, compared with existing methods.
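The abstract does not give the exact PSSF formulation, so the following is only a minimal sketch of the general idea: compress the magnitude spectrogram with a power exponent, then compute a half-wave-rectified spectral flux. The function name `power_scaled_spectral_flux`, the exponent `gamma`, and the frame parameters `n_fft` and `hop` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def power_scaled_spectral_flux(x, n_fft=1024, hop=512, gamma=0.5):
    """Hypothetical sketch: spectral flux on a power-compressed spectrogram.

    gamma < 1 compresses the dynamic range so that quiet onsets are not
    swamped by loud ones. All parameter values here are assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Magnitude spectrogram, then power scaling (assumed form: |X| ** gamma).
    scaled = np.abs(np.fft.rfft(frames, axis=1)) ** gamma
    # Frame-to-frame difference, keeping only energy increases.
    diff = np.diff(scaled, axis=0)
    return np.sum(np.maximum(diff, 0.0), axis=1)


# Usage: silence followed by a 440 Hz tone that starts at sample 8192.
sr = 16000
t = np.arange(sr) / sr
signal = np.where(np.arange(sr) >= 8192, np.sin(2 * np.pi * 440 * t), 0.0)
flux = power_scaled_spectral_flux(signal)
onset_frame = int(np.argmax(flux))
```

In this toy signal the flux is zero during the silence and peaks at the frame transition where the tone enters. A practical detector would still need peak picking and thresholding on `flux`, which the abstract does not describe.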
Type
conference paper
