Speaker Identification by the Consistency of Human Voice

Su, Chiun-Fu

Date Issued

2010

Author(s)

Su, Chiun-Fu

URI

http://ntur.lib.ntu.edu.tw//handle/246246/254042

Abstract

In speaker identification, timbre is often used to characterize speakers. Timbre is the primary auditory cue by which humans verify a speaker's identity, and it is carried by the harmonic components of a sound wave. Most methods in the literature for extracting a speaker's speech characteristics therefore focus on frequency-domain features. Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC) are common feature-extraction methods, but they were originally designed as parameters for speech recognition, so they vary with speech content, which limits identification performance. This thesis therefore extends the idea of the consistency of human voice [5] to develop a feature-extraction method that finds consistent feature vectors regardless of what a speaker says. The thesis is divided into two parts. First, the idea that speaker features reside in high-frequency bands is used to improve the consistency of the feature vectors that describe the timbre difference between two speakers. Second, the method of [5] is modified to investigate the consistency of an individual speaker's timbre. In this second part, a vocal tract model is used to obtain the frequency responses of speech, and a 22nd-order polynomial is fitted to each frequency response. The 23 normalized coefficients are then treated as a 23-dimensional feature vector, and these feature vectors are also found to be consistent. Finally, this feature-extraction method is applied to speaker identification and achieves good performance.
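The second part of the abstract (vocal tract model, frequency response, 22nd-order polynomial fit, 23 normalized coefficients) can be illustrated with a minimal sketch. It is not the thesis's implementation. Several details are assumptions made only for illustration: the vocal tract model is taken to be an all-pole LPC model of order 12 estimated by the autocorrelation method, the frequency response is taken as the log-magnitude in dB, the frequency axis is rescaled to [-1, 1] before fitting, and "normalized" is taken to mean unit Euclidean norm. The function names `lpc_coefficients` and `polynomial_feature` are hypothetical.

```python
import numpy as np


def lpc_coefficients(frame, order=12):
    """Estimate all-pole (LPC) coefficients with the autocorrelation method.

    Assumption: LPC order 12 stands in for the thesis's vocal tract model.
    Returns the denominator polynomial A(z) = 1 - sum_k a_k z^-k as an array.
    """
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(windowed[:n - k], windowed[k:])
                  for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    a = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -a))


def polynomial_feature(frame, lpc_order=12, poly_order=22, n_freq=256):
    """Fit a polynomial to the model's log-magnitude response.

    Returns poly_order + 1 coefficients (23 for order 22), scaled to unit
    Euclidean norm (an assumed normalization).
    """
    a = lpc_coefficients(frame, lpc_order)
    # Evaluate A on n_freq points from 0 to the Nyquist frequency.
    A = np.fft.rfft(a, n=2 * (n_freq - 1))
    # The all-pole response is H = 1 / A, so its dB magnitude is -20*log10|A|.
    log_mag = -20.0 * np.log10(np.abs(A) + 1e-12)
    # Rescale the frequency axis to [-1, 1] to keep the fit well conditioned.
    x = np.linspace(-1.0, 1.0, n_freq)
    coeffs = np.polyfit(x, log_mag, poly_order)
    return coeffs / np.linalg.norm(coeffs)
```

Under these assumptions, calling `polynomial_feature` on one speech frame yields a 23-dimensional vector. Vectors from different utterances could then be compared with a distance or similarity measure to check how consistent they are for one speaker.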

Subjects

Consistency of Human Voice

Feature Extraction

Speaker Identification

Vocal Tract Model

Polynomial Curve Fitting

Type

thesis

File(s)

Name

ntu-99-R97921045-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):65adeedab54f1f9a8296d49e74bba63c
