Interactive Spoken Content Retrieval by Deep Reinforcement Learning
Journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Date Issued
2018
Author(s)
Abstract
For text content retrieval, the user can easily scan through and select from a list of retrieved items. This is impossible for spoken content retrieval, because the retrieved items are not easily displayed on-screen. In addition, due to the high degree of uncertainty in speech recognition, retrieval results can be very noisy. One way to counter such difficulties is through user-machine interaction. The machine can take different actions to interact with the user and obtain better retrieval results before showing them to the user. For example, the machine can request extra information from the user or return a list of topics for the user to select from. In this paper, we propose using a deep Q-network (DQN) to determine the machine actions for interactive spoken content retrieval. The DQN bypasses the need to estimate hand-crafted states and determines the best action directly from the current retrieval results, without any human knowledge. It is shown to achieve significantly better performance than the previous approach based on hand-crafted states. We further find that double DQN and dueling DQN improve on the naive version.
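The difference between the naive, double, and dueling DQN variants named in the abstract can be illustrated with a minimal sketch. It shows only the standard textbook forms of action selection, the two update targets, and the dueling aggregation. The abstract does not specify the paper's features, rewards, or network architecture, so the action names, function names, and numeric values below are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical action names, loosely based on the examples in the abstract.
ACTIONS = ["request_extra_information", "return_topic_list", "show_results"]


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over the Q-values of the machine actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return argmax(q_values)                  # exploit


def dqn_target(reward, next_q_target, gamma, done):
    """Naive DQN target: the target network selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    best = argmax(next_q_online)
    return reward + gamma * next_q_target[best]


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + (A(s, a) - mean of A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


# Illustrative (made-up) Q-value estimates for the next dialogue state.
q_online = [1.0, 3.0, 2.0]
q_target = [5.0, 0.5, 1.0]
action = ACTIONS[select_action(q_online, 0.0, random.Random(0))]
naive = dqn_target(1.0, q_target, 0.9, False)
double = double_dqn_target(1.0, q_online, q_target, 0.9, False)
```

In this made-up example the naive target uses the target network's largest estimate, while the double DQN target evaluates the action preferred by the online network, which is the usual motivation for double DQN as a way to reduce overestimation.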
Subjects
deep-Q-learning; reinforcement learning; Spoken content retrieval; user-machine interaction
Type
journal article