Study on Speaker Recognition Using Decision Tree and K-Nearest Neighbor (應用決策樹及K最近鄰居法於語者辨識之研究)

Advisor: 陳國在
Institution: National Taiwan University, Graduate Institute of Engineering Science and Ocean Engineering
Author: 李旻曄 (Lee, Min-yeah)
Record dates: 2014-11-25; 2018-06-28
Year: 2014
Record URL: http://ntur.lib.ntu.edu.tw//handle/246246/260963

Abstract:
In recent years, advances in technology have made 3C products such as smartphones and tablets a pervasive part of daily life. This also means that confirming identity with a password is no longer a secure method. Biometric recognition has advanced rapidly along with this technology. Among all biometric features, the voice is one of the most important in humans, which means that a voiceprint can serve as a basis for speaker recognition. Machine learning algorithms are currently the algorithms that researchers use most often for classification and recognition. This study adopts one of them, K-Nearest Neighbor (KNN), which performs well in both classification and prediction and is also common in voice identification studies; with well-chosen parameters, it should achieve good recognition results. Although KNN works well for speaker recognition, its recognition performance can be improved if the training data and test data are assigned to groups in advance. This study therefore also adopts a decision tree: the test utterances and training utterances among the recorded utterances are classified first, and the result is then passed to the KNN algorithm.
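The recognition results did improve, as expected.

Table of contents:
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Preface
  1.2 Research Background
  1.3 Research Motivation
  1.4 Literature Review
  1.5 Thesis Outline
Chapter 2 Computing Speech Signal Feature Parameters
  2.1 Preface
  2.2 Speech Signal Preprocessing
    2.2.1 Digital Sampling of Speech Signals
    2.2.2 Normalization
    2.2.3 Pre-Emphasis
    2.2.4 Framing
    2.2.5 Windowing
    2.2.6 End-Point Detection
    2.2.7 Fast Fourier Transform (FFT)
  2.3 Speech Feature Extraction
    2.3.1 Cepstrum
    2.3.2 Mel Frequency
    2.3.3 Triangular Bandpass Filters
Chapter 3 Speaker Recognition Algorithms
  3.1 Machine Learning
  3.2 Introduction to Decision Trees
    3.2.1 C4.5 Decision Tree
    3.2.2 CHAID (Chi-square Automatic Interaction Detection)
    3.2.3 CART (Classification and Regression Tree)
  3.3 K-Nearest Neighbor
Chapter 4 Speech Corpus and Software Used in the Experiments
  4.1 Experimental Corpus
  4.2 Software Used in the Experiments
Chapter 5 Comparison of Experimental Data and System Architecture
  5.1 Experimental Procedure
  5.2 Experiment Description and Discussion
  5.3 Combination Method 1 (Speaker Recognition Using the KNN Algorithm)
    5.3.1 Preprocessing
    5.3.2 Training
    5.3.3 Classification
  5.4 Combination Method 2 (Speaker Recognition Combining the KNN Algorithm with a Decision Tree)
    5.4.1 Preprocessing
    5.4.2 Training
    5.4.3 Classification
  5.5 Effect of Noise
    5.5.1 Effect of Noise on Different Training Modes
    5.5.2 Clean Training Mode
    5.5.3 Comparison of Results for the Clean Training Mode
    5.5.4 Composite Training Mode
    5.5.5 Comparison of Results for the Composite Training Mode
  5.6 Summary of Experimental Results
  5.7 Comparison of KNN and SVM for Speaker Recognition
Chapter 6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
References

File size: 1238761 bytes
File format: application/pdf
Thesis usage rights: authorization not granted
Keywords: K-Nearest Neighbor; Decision Tree; Mel-Frequency Cepstral Coefficients; Speaker Recognition
English title: Study on Speaker Recognition Using Decision Tree And K-Nearest Neighbor
Type: thesis
File URL: http://ntur.lib.ntu.edu.tw/bitstream/246246/260963/1/ntu-103-R01525068-1.pdf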