Publication: On the preparation and validation of a large-scale dataset of singing transcription
Date
2021
Authors
Wang J.-Y.; Jang J.-S.R.
JYH-SHING JANG
Abstract
This paper proposes a large-scale dataset for singing transcription, along with methods for fine-tuning and validating its contents. The dataset, named MIR-ST500, consists of more than 160,000 notes from 500 pop songs. To create it, we set labeling criteria and ask non-experts to label the notes. We also adjust the annotations to correct minor errors. Finally, to validate the dataset, we train a singing transcription model on MIR-ST500 and evaluate it on various datasets. The results show that MIR-ST500, being properly labeled and validated, can be used to build a better singing transcription model for various purposes. © 2021 IEEE
Keywords
Automatic singing transcription, Dataset preparation, Dataset validation, Music information retrieval, Signal processing, Fine tuning, Large-scale dataset, Large dataset