Subspace-based phonotactic language recognition using multivariate dynamic linear models
Journal
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages
6870-6874
Date Issued
2013
Author(s)
Abstract
Phonotactics, which concerns the permissible phone patterns of a language and their frequencies of occurrence, is widely acknowledged to be closely related to spoken language recognition (SLR). With the assistance of phone recognizers, each speech utterance can be decoded into an ordered sequence of phone vectors, each filled with likelihood scores contributed by all possible phone models. In this paper, we propose a novel approach that uncovers the phonotactic structure hidden in these phone-likelihood vectors through a form of multivariate time series analysis: dynamic linear models (DLMs). These models treat the generation of phone patterns in each utterance as a dynamic system: the relationship between adjacent vectors is modeled linearly and time-invariantly, and unobserved states are introduced to capture the temporal coherence intrinsic to the system. Each utterance, once expressed as a DLM, is further transformed into a fixed-dimensional linear subspace, so that well-developed distance measures between subspaces can be applied to linear discriminant analysis (LDA) in a dissimilarity-based fashion. SLR experiments on the OGI-TS corpus demonstrate that the proposed framework outperforms well-known vector space modeling (VSM)-based methods and achieves performance comparable to our previous subspace-based method. © 2013 IEEE.
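To make the pipeline described in the abstract concrete, the sketch below illustrates the general idea under stated assumptions: a linear time-invariant state-space (DLM) of the form x_t = A x_{t-1} + w_t, y_t = C x_t + v_t is fitted to each utterance's sequence of phone-likelihood vectors, the model is mapped to a fixed-dimensional subspace via its extended observability matrix, and utterances are compared through principal angles between subspaces. The estimation here uses an SVD plus least-squares shortcut (as in dynamic-texture modeling) rather than the paper's exact procedure, and the function names (fit_dlm, observability_subspace, subspace_distance) and parameter values are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import subspace_angles

def fit_dlm(Y, n_states=10):
    """Fit a simple linear time-invariant state-space model (DLM) to an utterance.

    Y: (T, P) array, T frames of P phone-likelihood scores.
    Returns (A, C): state-transition and observation matrices, estimated with an
    SVD/least-squares shortcut (an assumption, not the paper's estimator).
    """
    Yc = Y - Y.mean(axis=0)                       # remove the mean observation
    U, S, Vt = np.linalg.svd(Yc.T, full_matrices=False)
    C = U[:, :n_states]                           # observation matrix (P x n)
    X = np.diag(S[:n_states]) @ Vt[:n_states]     # latent state sequence (n x T)
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])      # least-squares transition matrix
    return A, C

def observability_subspace(A, C, horizon=5):
    """Stack C, CA, CA^2, ... into an extended observability matrix and return an
    orthonormal basis of its column space, i.e. a fixed-dimensional subspace
    representation of the utterance."""
    blocks = [C]
    for _ in range(horizon - 1):
        blocks.append(blocks[-1] @ A)
    Q, _ = np.linalg.qr(np.vstack(blocks))
    return Q

def subspace_distance(Q1, Q2):
    """Dissimilarity between two utterances via principal angles between subspaces."""
    return np.linalg.norm(subspace_angles(Q1, Q2))

# Usage with synthetic phone-likelihood sequences (stand-ins for decoder output):
Y1 = np.random.rand(200, 40)   # 200 frames, 40 phone models
Y2 = np.random.rand(150, 40)
Q1 = observability_subspace(*fit_dlm(Y1))
Q2 = observability_subspace(*fit_dlm(Y2))
print(subspace_distance(Q1, Q2))
```

In a dissimilarity-based setup such as the one the abstract describes, distances like the one returned by subspace_distance would be computed between each utterance and a set of reference utterances, and the resulting dissimilarity vectors fed to LDA for language classification.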
Subjects
phonotactic language recognition
Other Subjects
Dynamic linear model; Language recognition; Linear discriminant analysis; Multivariate time series analysis; Specific languages; Spoken language recognition; Subspace based methods; Vector space models; Lagrange multipliers; Signal processing; Speech recognition; Telephone sets; Telephone systems; Time series analysis; Vectors
Type
conference paper
