Pitch Detection for Polyphonic Music (複音音樂之音高辨識)

Contributor: 鄭士康
Affiliation: National Taiwan University, Graduate Institute of Electrical Engineering
Author: 王炳堯 (Wang, Ping-Yao)
Record dates: 2010-07-01; 2018-07-06
Year: 2008
Identifier: U0001-2907200813463500
URI: http://ntur.lib.ntu.edu.tw//handle/246246/187935

Abstract:
Automatic music transcription has recently become a popular research topic in computer music. However, estimating pitch in a polyphonic music signal is difficult because the signal is a mixture of waveforms from all sounding notes with phase differences, so simple greedy methods easily produce estimation errors. In this thesis, we propose a method for estimating the pitches in polyphonic music. We focus on separating, from the observed mixture, the harmonic components that share the same frequency but belong to different notes. Using a pre-built probabilistic model of instrument timbre, which provides a reference for plausible ratios among the harmonic components of a pitched note, we apply a global optimization method to find the parameters that best fit the input signal and thereby obtain the pitches in the music signal.
Two types of evaluation, pitch estimation on note combinations at different intervals and pitch estimation on short music pieces, were carried out on the proposed system and on other methods. The results show the improvement in accuracy and robustness that the proposed method provides.

Table of contents:
Oral Examination Committee Certification
Acknowledgements
Abstract
Abstract (Chinese)
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background
  1.2 Previous works
  1.3 Goal of this thesis
Chapter 2 Discrete-Time Signal Analysis in Frequency Domain
  2.1 Short-Time Fourier Transform
  2.2 Constant Q Transform
Chapter 3 Problem and Our Method
  3.1 Assumption on the signal to be estimated
  3.2 Definition of Parameters
  3.3 Estimation of optimal parameters
  3.4 Probabilistic model and likelihood of timbre parameters for instrument
  3.5 The evaluation function
Chapter 4 Building Probabilistic Model of Instrument Timbre
  4.1 Probability Distribution for normalized parameters
    4.1.1 Beta Distribution
    4.1.2 Dirichlet Distribution
  4.2 Maximum Likelihood Estimation of the Model Parameter
  4.3 Maximum Likelihood Estimation of the Probability Distributions
Chapter 5 Global Optimization by Adaptive Simulated Annealing
  5.1 Why Global Optimization and Adaptive Simulated Annealing?
  5.2 ASA Algorithm Description
    5.2.1 Generating probability density function
    5.2.2 Acceptance Function
  5.3 Algorithm Detail
  5.4 Algorithm parameter tuning and convergence
Chapter 6 System Architecture
  6.1 System Organization
  6.2 Spectrum Extractor
  6.3 Pitch Estimator
  6.4 Timbre Model Builder
  6.5 Post Processor
Chapter 7 System Evaluation and Comparison
  7.1 Samples used for evaluation
  7.2 Design of the evaluation process
    7.2.1 Evaluation in estimating pitches of combinations of different intervals of notes
    7.2.2 Evaluation in estimating pitches in short music pieces
  7.3 Evaluation Metrics in estimating pitches of combinations of different intervals of notes
    7.3.1 Rank Sum Difference of Correct Pitches (RSDCP)
    7.3.2 False to Correct Ratio (FCR)
  7.4 Evaluation Metrics in estimating pitches in short music pieces
  7.5 Evaluation parameters used in different algorithms
    7.5.1 Parameters used in proposed method
    7.5.2 Parameters used in Spectrum Subtraction method
  7.6 Evaluation Results
    7.6.1 Evaluation Result of estimating pitches of combinations of different intervals of notes
    7.6.2 Evaluation Result of estimating small pieces of music
  7.7 Summary of the evaluation
  7.8 Further Experiments and Discussions
    7.8.1 Maximum Number of Polyphony
Chapter 8 Conclusions
References

File size: 3835539 bytes
Format: application/pdf
Language: en-US
Keywords: Polyphonic Music; Pitch Estimation; Automatic Music Transcription; Pattern Recognition; Global Optimization
Alternative title: Pitch Detection for Polyphonic Music
Type: thesis
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/187935/1/ntu-97-R95921042-1.pdf