An Integrated Framework for Recognizing Highly Imbalanced Bilingual Code-switched Lectures with Cross-language Acoustic Modeling and Frame-level Language Identification
Date Issued
2015
Author(s)
Yeh, Ching-Feng
Abstract
This thesis considers the recognition of a widely observed type of bilingual code-switched speech, in which the speaker speaks primarily in the host language (usually the speaker's native language) but inserts a few words or phrases in the guest language (usually a second language) into many utterances. In this case, the languages are switched back and forth within an utterance, which makes language identification difficult, and much less data is available for the guest language, which results in poor recognition accuracy for the guest-language portions. In this thesis, we propose an integrated framework for recognizing such highly imbalanced code-switched speech. The framework includes: unit merging approaches on three levels of acoustic modeling (triphone models, HMM states, and Gaussians) for cross-lingual data sharing; unit recovery for reconstructing the identity of the units of the two languages after they have been merged; unit occupancy ranking, which offers much more flexible data sharing between units, both across languages and within a language, based on the accumulated occupancy of the HMM states; and estimation of frame-level language posteriors using Blurred Posteriorgram Features (BPFs) for use in decoding. In addition, we evaluate two approaches that extend the above HMM-based approaches to state-of-the-art deep neural networks (DNNs): using bottleneck features in HMM/GMM systems and modeling context-dependent HMM states. We present a complete set of experimental results comparing all of the approaches for a real-world application scenario under unified conditions, and show that the proposed approaches achieve very good improvement.
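The idea of ranking units by accumulated occupancy can be illustrated with a minimal sketch. This is not the thesis's algorithm, only an assumed simplification: units are ordered by how many frames they have accumulated, and those below a threshold are flagged as candidates for data sharing. The unit names, occupancy values, and the `min_occupancy` threshold are all hypothetical.

```python
from typing import Dict, List, Tuple


def rank_units_by_occupancy(occupancy: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (unit, occupancy) pairs sorted from least to most occupied.

    Ties are broken by unit name so the ordering is deterministic.
    """
    return sorted(occupancy.items(), key=lambda kv: (kv[1], kv[0]))


def select_sharing_candidates(occupancy: Dict[str, float],
                              min_occupancy: float) -> List[str]:
    """Return units whose accumulated occupancy is below the threshold.

    Assumption: such data-poor units are the ones that would benefit
    from sharing data with other units.
    """
    ranked = rank_units_by_occupancy(occupancy)
    return [unit for unit, occ in ranked if occ < min_occupancy]


# Hypothetical accumulated occupancies (frame counts) for HMM states of
# a data-rich host language ("host_*") and a data-poor guest language
# ("guest_*").
occupancy = {
    "host_a_s2": 5200.0,
    "host_i_s2": 4100.0,
    "guest_ae_s2": 35.0,
    "guest_th_s2": 12.0,
}

candidates = select_sharing_candidates(occupancy, min_occupancy=100.0)
print(candidates)
```

In this toy setting only the two guest-language states fall below the threshold, which mirrors the imbalance described in the abstract; how the thesis actually pairs or merges units is not specified here.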
Subjects
Speech Recognition
Code-switching
Acoustic Modeling
Language Identification
Type
thesis
File(s)
Name
ntu-104-D00942013-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):1d10c19111689ac1fc82113f720e0a2d
