Efficient Standard Audio Format Conversion and Music Retrieval System
Date Issued
2005
Date
2005
Author(s)
Yeh, Fang-Chun
Abstract
In the last few years, digital multimedia storage and search techniques have become increasingly important as personal computers have advanced rapidly. As a result, several digital music formats have been published in succession. Because these formats differ, sampling rate conversion (SRC) becomes an important step when converting from one music format to another. Furthermore, with the popularity of broadband networks, multimedia database search is a technique in strong demand. Content-based music information retrieval, which aims to give users a convenient and intuitive search engine, therefore remains a topic of interest.
This thesis covers two subjects. The first is an efficient algorithm for conversion between standard audio formats. The sampling rate conversion process depends entirely on the scaling factor. In a conventional multirate system, the scaling factor must be a rational number: to convert from one audio format to another, the ratio of the sampling frequencies is reduced to lowest terms, and the conversion is carried out by interpolation and decimation. For irrational scaling factors, there has long been no well-established definition. In this thesis, we operate on the signal linearly in the frequency domain and thereby implement irrational scaling efficiently for audio format conversion.
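As an illustration of the conventional rational case only (not the frequency-domain method proposed in the thesis), the reduction of the sampling frequency ratio to lowest terms can be sketched as follows. The function name `src_factors` and the example rates of 44100 Hz and 48000 Hz are assumptions chosen for illustration.

```python
from fractions import Fraction


def src_factors(f_in, f_out):
    """Return (L, M): interpolate by L, then decimate by M.

    Hypothetical helper for illustration. Fraction reduces
    f_out / f_in to lowest terms automatically.
    """
    ratio = Fraction(f_out, f_in)
    return ratio.numerator, ratio.denominator


# Example: converting 44100 Hz audio to 48000 Hz
up, down = src_factors(44100, 48000)
print(up, down)  # prints "160 147"
```

Even after reduction, the integer factors can be large (160 and 147 here), which is one reason a conventional interpolate-then-decimate chain can be costly for such format pairs.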
The second subject is a content-based music retrieval system. The system takes a hummed melody as the query input and uses a Fourier-of-Fourier-transform algorithm for pitch tracking. For matching, we use a first-note-based pitch contour as the matching feature, plus an off-key adjustment. Combining these two methods greatly speeds up the system and also improves the recall rate. The music database currently contains 96 songs, and a phrase query mechanism lets users hum a query starting from any phrase within a song, so the database holds about 745 phrases. With the first-note-based pitch contour and off-key adjustment applied to the matching features, searching takes about 1.9 seconds for an 8-second hummed query. For 500 test humming inputs, the top-10 recall rate is about 74.60%.
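The abstract does not define the matching features in detail, so the following is only a rough sketch under stated assumptions: the first-note-based pitch contour is taken to be each note's pitch in semitones relative to the first note (which makes the feature independent of the key the user hums in), and the off-key adjustment is modeled as trying small constant shifts of the query contour and keeping the best match. All function names, the distance measure, and the shift range are hypothetical.

```python
import math


def hz_to_semitones(freq_hz):
    # Standard MIDI-style mapping: A4 = 440 Hz corresponds to 69
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def first_note_contour(pitches_hz):
    # Assumption: express every note relative to the first note
    semis = [hz_to_semitones(f) for f in pitches_hz]
    return [s - semis[0] for s in semis]


def contour_distance(a, b):
    # Mean absolute difference over the overlapping part (assumed metric)
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("contours must be non-empty")
    return sum(abs(x - y) for x, y in zip(a[:n], b[:n])) / n


def adjusted_distance(query, candidate, max_shift=1.0, step=0.25):
    # Hypothetical off-key adjustment: if the first note was hummed
    # off-key, the whole contour is offset, so try small shifts.
    steps = int(round(max_shift / step))
    best = math.inf
    for k in range(-steps, steps + 1):
        shifted = [q + k * step for q in query]
        best = min(best, contour_distance(shifted, candidate))
    return best
```

Under these assumptions, a melody hummed one octave higher yields the same contour as the original, and a query whose contour is uniformly half a semitone sharp still matches its target with zero distance after adjustment.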
Subjects
音樂格式轉換
無理倍數升頻
無理倍數降頻
內涵式音樂搜尋系統
兩次傅利葉轉換
走音修正
music format conversion
irrational decimation
irrational interpolation
content-based music retrieval system
pitch
Fourier of Fourier transform
off-key adjustment
Type
thesis