Improved Query by Humming System Using the Tempo and Frequency Information and Advanced Onset Detection and Melody Matching Methods

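College of Electrical Engineering and Computer Science: Graduate Institute of Communication Engineering
Advisor: 丁建均
Author: 林巧薇 (Lin, Chiao-Wei)
Record dates: 2017-03-06, 2018-07-05
Year: 2016
URI: http://ntur.lib.ntu.edu.tw//handle/246246/276029
Title: 應用節奏與頻率資訊之改良式哼唱檢索系統及改良式發端偵測與旋律匹配 (Improved Query by Humming System Using the Tempo and Frequency Information and Advanced Onset Detection and Melody Matching Methods)
Type: thesis
DOI: 10.6342/NTU201600831
Thesis access rights: authorization not granted
Keywords: query by humming; beat; onset detection; pitch estimation; melody matching; hidden Markov model

Abstract (translated from the Chinese):

In recent years, speech systems have been applied very widely, from analyzing sound signals and extracting features through to synthesis and compression. Applied to a query by humming system, these techniques let users search by a song's melody and rhythm instead of by traditional known text information such as the song title, singer, or lyrics. Given the size of today's music databases, such a system should finish a search in a short time, so it must balance search efficiency and effectiveness. A query by humming system consists of three main parts: the input, the database, and the system algorithm. The input is what the system receives, which is usually plain singing. The database holds the song data available for comparison and supplies the candidate songs. The system algorithm first extracts features from the input and converts them into a comparable format, such as pitch and note length, then uses a matching algorithm to compare the input sequence with the database. The matching algorithm compares the input signal with each song in the database in turn and returns a list of candidate songs sorted by similarity score. However, most people cannot sing the correct melody exactly as in the original song, and everyone's singing style differs, so a good system should handle these situations as well as it can. In general, matching approaches fall into two types: note-based and frame-based. The former uses notes as the unit for both the database and the input signal, and its advantage is higher efficiency. The latter uses frames as the unit and is generally more effective in matching. In this thesis, we use a note-based matching system, propose an improved onset detection method, and also improve the melody matching method in order to improve the query by humming system. In addition, we use our own proposed method for pitch detection. For future research, we hope to further improve the efficiency and effectiveness of query by humming, especially because the number of songs in real-world databases is very large, which places greater demands on computation speed. For onset detection, although our method already achieves good results, there is still room for improvement, and we hope future research can reach better accuracy.

Abstract (English):

In recent years, voice signal systems have been widely used. The techniques involved include voice signal analysis, feature extraction, voice synthesis, and compression. Applying these techniques to a query by humming (QBH) system, we no longer need traditional concrete descriptions such as the song name, singer, or lyrics. Instead, we can use melody and tempo information to search for a song. Music databases are now large and search results should be returned in a short time, so system efficiency is as important as effectiveness. A QBH system includes three parts: the input data, the music database, and the matching algorithm. The input data of a QBH system is usually human humming or singing. The database is a collection of songs provided for search. The system first extracts features of the input signal, such as the pitch and length of notes, then uses the matching algorithm to compare them with the songs in the database. The QBH system compares the input signal with the database and lists a ranking of possible songs by matching score. However, people usually cannot sing exactly like the reference song, and singing style differs from person to person. A good QBH system should be able to deal with the problems that arise in amateur singing. Generally speaking, melody matching falls into two types: the note-based method and the frame-based method. The advantage of the note-based system is its efficiency, while the frame-based system is more effective. In this thesis, we use the note-based method. We propose an advanced onset detection method and an improved melody matching system to improve the QBH system. In addition, we use our own pitch estimation method to estimate the fundamental frequency. Our experiments show that the proposed onset detection method performs better than the other methods compared. However, future work should further improve the system's efficiency and effectiveness, since real-world databases are huge.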
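
The note-based pipeline described in the abstract (turn the hummed input into a note sequence, compare it with every song in the database, and rank songs by score) can be illustrated with a small sketch. This is not the thesis's matching method, which is not specified in this record and, judging by the keywords, involves a hidden Markov model. As a stand-in, the sketch uses plain edit distance over pitch intervals. The function names, the MIDI note numbers, and the two-song database are all hypothetical and exist only for illustration. Note lengths and onset detection are left out.

```python
from typing import Dict, List, Tuple


def intervals(pitches: List[int]) -> List[int]:
    """Successive pitch differences in semitones.

    Using intervals instead of absolute pitches makes the comparison
    independent of the key the user happens to hum in.
    """
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a: List[int], b: List[int]) -> int:
    """Levenshtein distance between two interval sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank_songs(query: List[int],
               database: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Return (song name, distance) pairs, best match first."""
    q = intervals(query)
    scored = [(name, edit_distance(q, intervals(notes)))
              for name, notes in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical database: song name -> MIDI note numbers of the melody.
database = {
    "song_a": [60, 60, 67, 67, 69, 69, 67],
    "song_b": [64, 62, 60, 62, 64, 64, 64],
}

# song_a hummed two semitones higher, with one wrong note (70 instead of 71).
query = [62, 62, 69, 69, 71, 70, 69]

ranking = rank_songs(query, database)
print(ranking)
```

In this toy case the transposition costs nothing, and the single wrong note changes two neighbouring intervals, so the query sits at distance 2 from `song_a` and ranks it first. A practical system would also need to tolerate inserted or dropped notes, tempo variation, and much larger databases, which is where the efficiency concerns raised in the abstract come in.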