Title: | Toward unsupervised model-based spoken term detection with spoken queries without annotated data |
Authors: | Chan, C.-A.; Chung, C.-T.; Kuo, Y.-H.; LIN-SHAN LEE |
Keywords: | query-by-example; speech pattern discovery; unsupervised spoken term detection; zero-resource |
Issue Date: | 2013 |
Start page/Pages: | 8550 - 8554 |
Source: | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Conference: | 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 |
Abstract: | We present a two-stage model-based approach for unsupervised query-by-example spoken term detection (STD) without any annotated data. Compared to the prevailing DTW approaches for the unsupervised STD task, the HMMs used by model-based approaches can better capture the signal distributions and time trajectories of speech with a more global view of the spoken archive; matching with model states also significantly reduces the computational load. The utterances in the spoken archive are first decoded offline into acoustic patterns that are automatically discovered, in an unsupervised way, from the spoken archive itself. In the first stage, we propose a document state matching (DSM) approach, where query frames are matched to the HMM state sequences for the spoken documents. In this process, a novel duration-constrained Viterbi (DC-Vite) algorithm is proposed to avoid unrealistic speaking rate distortion. In the second stage, pseudo-relevant and pseudo-irrelevant examples retrieved from the first stage are used to construct query and anti-query HMMs, respectively. Each spoken term hypothesis is then rescored with the likelihood ratio between these two HMMs. Experimental results on a Mandarin broadcast news corpus show an absolute improvement of 11.8% in mean average precision, with a reduction of more than 50% in computation time, compared to the segmental DTW approach. © 2013 IEEE. |
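The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DC-Vite implementation: the abstract does not say how the duration constraint is defined, so the sketch assumes a simple per-state minimum and maximum frame count. The names `duration_constrained_align`, `likelihood_ratio_score`, `dist`, `min_dur`, and `max_dur` are hypothetical, and the frame-to-state distances are assumed to be precomputed (for example, as negative log-likelihoods).

```python
import math


def duration_constrained_align(dist, min_dur, max_dur):
    """Return the minimum total cost of aligning T frames to S states in order.

    dist[t][s] is an assumed, precomputed distance between query frame t and
    document state s. Every state must absorb between min_dur and max_dur
    consecutive frames (an assumed form of the duration constraint).
    Returns math.inf if no alignment satisfies the constraint.
    """
    T = len(dist)
    S = len(dist[0]) if T else 0
    # prefix[s][t] = sum of dist[0..t-1][s], so a segment cost is one subtraction.
    prefix = [[0.0] * (T + 1) for _ in range(S)]
    for s in range(S):
        for t in range(T):
            prefix[s][t + 1] = prefix[s][t] + dist[t][s]
    # best[s][t] = cheapest alignment of the first t frames to the first s states.
    best = [[math.inf] * (T + 1) for _ in range(S + 1)]
    best[0][0] = 0.0
    for s in range(1, S + 1):
        for t in range(1, T + 1):
            # Try every allowed duration d for state s, ending at frame t.
            for d in range(min_dur, min(max_dur, t) + 1):
                prev = best[s - 1][t - d]
                if prev == math.inf:
                    continue
                seg = prefix[s - 1][t] - prefix[s - 1][t - d]
                best[s][t] = min(best[s][t], prev + seg)
    return best[S][T]


def likelihood_ratio_score(loglik_query, loglik_anti):
    """Second-stage score: log-likelihood ratio of query HMM to anti-query HMM."""
    return loglik_query - loglik_anti


# Four query frames against a two-state document sequence.
dist = [[0, 1], [0, 1], [0, 1], [1, 0]]
loose = duration_constrained_align(dist, min_dur=1, max_dur=3)
tight = duration_constrained_align(dist, min_dur=1, max_dur=2)
```

With the looser bound the first state may absorb three frames, while the tighter bound forces a two-plus-two split at a higher cost. That is the sense in which a duration limit rules out alignments that would imply an implausible speaking rate.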
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/498582 https://www.scopus.com/inward/record.uri?eid=2-s2.0-84890526441&doi=10.1109%2fICASSP.2013.6639334&partnerID=40&md5=cd1d9f75c9431093c16f8c89dec15998 |
ISSN: | 1520-6149 |
DOI: | 10.1109/ICASSP.2013.6639334 |
SDG/Keyword: | Model based approach; Precision improvement; Query-by-example; Signal distribution; Speech patterns; Spoken Term Detection (STD); Spoken term detections; zero-resource; Signal processing; Speech recognition; Viterbi algorithm; Query processing |
Appears in Collections: | Department of Electrical Engineering |