A comparative study on the performance of several speech recognition techniques applied on the highly confusing Mandarin syllables
Journal
Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A/Chung-kuo Kung Ch'eng Hsueh K'an
Journal Volume
12
Journal Issue
6
Pages
705-713
Date Issued
1989
Author(s)
Abstract
In this paper, the performance of several speech recognition techniques applied to the highly confusing Mandarin syllables was carefully compared. The techniques are dynamic time warping (DTW), the newly proposed DTW with superimposed weighting function (DTWW), discrete hidden Markov models (DHMM) and continuous hidden Markov models (CHMM). The vocabulary consists of 409 first-tone isolated Mandarin syllables. Because this vocabulary contains many confusing sets, accurate recognition of these syllables is relatively difficult, and all recognition experiments were performed in speaker-dependent mode. After a series of 13 experiments, it was found that the recognition rate of the newly proposed DTWW (88.3%) is higher than that of DTW (85.1%), DHMM (65.0%) and CHMM (83.9%), and that the CPU time used for DTWW is 1.03 times that for DTW, 24 times that for DHMM and 4.3 times that for CHMM. In addition, the memory space required for DTWW and DTW is 3.4 times that of DHMM and 8.5 times that of CHMM. Therefore, DTWW has the highest recognition rate and DHMM has the fastest recognition speed, whereas CHMM appears very attractive when recognition rate, recognition speed and memory space requirement are all considered. © 1989 Taylor & Francis Group, LLC.
Subjects
Dynamic time warping; Hidden Markov model; Isolated word speech recognition
Other Subjects
Probability--Random Processes; Dynamic Time Warping; Mandarin Syllables; Markov Models; Recognition Rate; Recognition Speed; Superimposed Weighting Function; Speech
Type
journal article