鄭士康
National Taiwan University: Graduate Institute of Communication Engineering
黃耀樟 (Huang, Yao-Chang)
2007-11-27
2018-07-05
2004
http://ntur.lib.ntu.edu.tw//handle/246246/58854

This thesis proposes a music information retrieval system and an audio recommendation system built on MPEG-7 Audio. It first introduces the parts of MPEG-7 Audio that describe music: the Melody and Audio Signature Description Schemes. The retrieval system extracts the MPEG-7 Melody Description Scheme representation from MIDI input and uses it to find the clips in the database that are most similar to a query. During melodic contour extraction, the system applies the highest pitch method to handle chords that occur within the melody; local alignment, a dynamic programming algorithm, is used to compute similarity. The Xerces C++ XML parser parses the MPEG-7 files. Finally, the system produces a rating table with which users can evaluate its performance. The audio recommendation system, on the other hand, uses content-based filtering. From the music a user prefers, the system extracts the corresponding MPEG-7 audio signatures and applies LBG vector quantization to group them into clusters.
A new song can then be assigned to a cluster and given a score that determines whether it is recommended. For the retrieval system, on a collection of 1242 MIDI files, the recognition rate for the top-ranked candidate reaches 65% at a contour length of 10 and rises as the contour length increases. For the recommendation system, two people rated 571 WAV and MP3 songs according to their preferences, and their ratings were compared with the scores given by the system; the average correct recommendation rate reached 75%.

Chapter 1 Introduction
1.1 Motivation
1.2 Previous Works
1.3 Contribution
1.4 Chapter Outlines
Chapter 2 Background
2.1 Overview
2.2 Music Content Based Retrieval
2.3 XML Overview
2.3.1 XML
2.3.2 XML Basic
2.4 MPEG-7 Audio Overview
2.4.1 Composition
2.4.2 Definitions
2.4.3 Implementation
2.5 Melody Description Scheme
2.5.1 MelodyType
2.5.2 MelodyContourType
2.5.3 MelodySequenceType
2.6 Audio Spectrum Flatness Descriptor (ASF D)
2.7 Audio Signature Description Scheme
Chapter 3 Music Information System Based on Melody Description Scheme
3.1 System Overview
3.2 XML Parser Application Interface
3.3 Preprocessing
3.3.1 Melody Extraction
3.3.2 Melody Standardization
3.3.3 Implementation in Melody Description Scheme
3.4 Pattern Matching
3.5 Experiments
3.5.1 Experiment Performance Evaluation
3.5.2 Test Collection and Query Set
3.5.3 Experiment Results
Chapter 4 Audio Recommendation System Based on Audio Signature
4.1 Introduction
4.2 Audio Signature
4.3 System Architecture
4.3.1 The Feature Extraction Phase
4.3.2 The Classification Phase
4.3.3 The Rating Phase
4.4 Experiment Results
4.4.1 Experiment Design
4.4.2 Experiment Results
Chapter 5 Discussions and Conclusions
5.1 MPEG-7 Audio Aspect
5.2 System Aspect
5.3 Summary
Reference
Appendix A: XML Schema

594018 bytes
application/pdf
en-US
content-based retrieval; music information retrieval; recommendation system; MPEG-7 Audio; melody; audio signature
建構在MPEG-7 Audio上的音樂檢索與推薦系統
Music Information Retrieval and Audio Recommendation System Based on MPEG-7 Audio
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/58854/1/ntu-93-R91942037-1.pdf
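
The retrieval steps named in the abstract (highest pitch melody extraction, contour quantization, and local alignment) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `(onset, pitch)` note representation, the function names, the semitone thresholds used for the five contour levels, and the scoring values (match +2, mismatch -1, gap -1) are all assumptions chosen for the example.

```python
def highest_pitch_melody(notes):
    """Reduce polyphonic MIDI notes to one melody line.

    notes: iterable of (onset_time, midi_pitch) pairs (assumed format).
    When several notes share an onset (a chord), only the highest is kept.
    """
    top = {}
    for onset, pitch in notes:
        if onset not in top or pitch > top[onset]:
            top[onset] = pitch
    return [top[t] for t in sorted(top)]


def contour(pitches):
    """Quantize successive pitch intervals into five levels (-2..2).

    Assumed thresholds: 0 semitones -> 0, 1-2 semitones -> +/-1,
    3 or more semitones -> +/-2.
    """
    def level(diff):
        if diff == 0:
            return 0
        step = 1 if abs(diff) <= 2 else 2
        return step if diff > 0 else -step

    return [level(b - a) for a, b in zip(pitches, pitches[1:])]


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of two sequences (Smith-Waterman style).

    Uses dynamic programming with two rows; cell values never drop
    below zero, so the best-scoring local region is found.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


if __name__ == "__main__":
    # A C major chord at onset 0, then single notes and a two-note chord.
    notes = [(0, 60), (0, 64), (0, 67), (1, 62), (2, 65), (2, 60), (3, 65)]
    melody = highest_pitch_melody(notes)
    query = contour(melody)
    print(melody, query)
    print(local_alignment_score(query, [1, -2, 2, 0, -1]))
```

Ranking a collection would then amount to computing this score between the query contour and each stored contour and sorting by score; how the thesis normalizes or thresholds the scores is not stated in this record.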