https://scholars.lib.ntu.edu.tw/handle/123456789/498582
Title: | Toward unsupervised model-based spoken term detection with spoken queries without annotated data |
Authors: | Chan, C.-A.; Chung, C.-T.; Kuo, Y.-H.; LIN-SHAN LEE |
Keywords: | query-by-example; speech pattern discovery; Unsupervised spoken term detection; zero-resource |
Publication date: | 2013 |
Pages: | 8550 - 8554 |
Source publication: | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Conference paper: | 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 |
Abstract: | We present a two-stage model-based approach for unsupervised query-by-example spoken term detection (STD) without any annotated data. Compared to the prevailing DTW approaches for the unsupervised STD task, HMMs used by model-based approaches can better capture the signal distributions and time trajectories of speech with a more global view of the spoken archive; matching with model states also significantly reduces the computational load. The utterances in the spoken archive are first decoded offline into acoustic patterns automatically discovered in an unsupervised way from the spoken archive. In the first stage, we propose a document state matching (DSM) approach, where query frames are matched to the HMM state sequences for the spoken documents. In this process, a novel duration-constrained Viterbi (DC-Vite) algorithm is proposed to avoid unrealistic speaking rate distortion. In the second stage, pseudo relevant/irrelevant examples retrieved from the first stage are respectively used to construct query/anti-query HMMs. Each spoken term hypothesis is then rescored with the likelihood ratio to these two HMMs. Experimental results show an absolute 11.8% improvement in mean average precision, with a more than 50% reduction in computation time, compared to the segmental DTW approach on a Mandarin broadcast news corpus. © 2013 IEEE. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/498582 https://www.scopus.com/inward/record.uri?eid=2-s2.0-84890526441&doi=10.1109%2fICASSP.2013.6639334&partnerID=40&md5=cd1d9f75c9431093c16f8c89dec15998 |
ISSN: | 15206149 |
DOI: | 10.1109/ICASSP.2013.6639334 |
SDG/Keywords: | Model based approach; Precision improvement; Query-by-example; Signal distribution; Speech patterns; Spoken Term Detection (STD); Spoken term detections; zero-resource; Signal processing; Speech recognition; Viterbi algorithm; Query processing |
Appears in: | Department of Electrical Engineering |
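The abstract mentions a duration-constrained Viterbi (DC-Vite) step that matches query frames to a document's HMM state sequence while avoiding unrealistic speaking-rate distortion. The record gives no details of that algorithm, so the following is only a minimal sketch of one generic way to constrain state durations in a left-to-right alignment, not the authors' method. The function name `duration_constrained_align`, the inputs `loglik`, `min_dur`, and `max_dur`, and the requirement that every minimum duration is at least 1 are all assumptions made for illustration.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def duration_constrained_align(
    loglik: List[List[float]],  # loglik[t][s]: log-likelihood of frame t under state s (hypothetical input)
    min_dur: List[int],         # assumed per-state minimum duration in frames (each >= 1)
    max_dur: List[int],         # assumed per-state maximum duration in frames
) -> Tuple[float, List[int]]:
    """Align T frames to S states in order, each state covering a bounded
    number of consecutive frames. Returns (best score, per-state durations),
    or (NEG_INF, []) when no alignment satisfies the bounds."""
    T, S = len(loglik), len(min_dur)

    # prefix[s][t] = sum of loglik[0..t-1][s], so any segment sum is O(1).
    prefix = [[0.0] * (T + 1) for _ in range(S)]
    for s in range(S):
        for t in range(T):
            prefix[s][t + 1] = prefix[s][t] + loglik[t][s]

    # best[s][t]: best score when the first s states cover the first t frames.
    best = [[NEG_INF] * (T + 1) for _ in range(S + 1)]
    back = [[0] * (T + 1) for _ in range(S + 1)]
    best[0][0] = 0.0

    for s in range(1, S + 1):
        for t in range(1, T + 1):
            for d in range(min_dur[s - 1], min(max_dur[s - 1], t) + 1):
                prev = best[s - 1][t - d]
                if prev == NEG_INF:
                    continue
                cand = prev + prefix[s - 1][t] - prefix[s - 1][t - d]
                if cand > best[s][t]:
                    best[s][t] = cand
                    back[s][t] = d

    if best[S][T] == NEG_INF:
        return NEG_INF, []

    # Trace back the chosen duration of each state, last state first.
    durations: List[int] = []
    t = T
    for s in range(S, 0, -1):
        d = back[s][t]
        durations.append(d)
        t -= d
    durations.reverse()
    return best[S][T], durations


# Toy data: three frames that fit state 0, then one frame that fits state 1.
toy = [[0.0, -5.0], [0.0, -5.0], [0.0, -5.0], [-5.0, 0.0]]
free_score, free_durs = duration_constrained_align(toy, [1, 1], [4, 4])
tight_score, tight_durs = duration_constrained_align(toy, [1, 1], [2, 2])
```

With loose bounds the toy alignment follows the data, stretching state 0 over three frames. With a maximum of two frames per state it is forced into an even split at a lower score. That is the kind of limit on stretching that a duration constraint is meant to impose.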