Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns
Journal
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages
8297-8301
Date Issued
2013
Author(s)
Abstract
Query expansion techniques were originally developed for text information retrieval in order to retrieve the documents not containing the query terms but semantically related to the query. This is achieved by assuming the terms frequently occurring in the top-ranked documents in the first-pass retrieval results to be query-related and using them to expand the query to do the second-pass retrieval. However, when this approach was used for spoken content retrieval, the inevitable recognition errors and the OOV problems in ASR make it difficult for many query-related terms to be included in the expanded query, and much of the information carried by the speech signal is lost during recognition and not recoverable. In this paper, we propose to use a second ASR engine based on acoustic patterns automatically discovered from the spoken archive used for retrieval. These acoustic patterns are discovered directly based on the signal characteristics, and therefore can compensate for the information lost during recognition to a good extent. When a text query is entered, the system generates the first-pass retrieval results based on the transcriptions of the spoken segments obtained via the conventional ASR. The acoustic patterns frequently occurring in the spoken segments ranked on top of the first-pass results are considered as query-related, and the spoken segments containing these query-related acoustic patterns are retrieved. In this way, even though some query-related terms are OOV or incorrectly recognized, the segments including these terms can still be retrieved by acoustic patterns corresponding to these terms. Preliminary experiments performed on Mandarin broadcast news offered very encouraging results. © 2013 IEEE.
Subjects
Acoustic Pattern Discovery; Query Expansion
Type
conference paper