https://scholars.lib.ntu.edu.tw/handle/123456789/633098
Title: | MTI-Net: A Multi-Target Speech Intelligibility Prediction Model | Authors: | Zezario, Ryandhimas E.; Fu, Szu Wei; Chen, Fei; CHIOU-SHANN FUH; Wang, Hsin Min; Tsao, Yu |
Keywords: | self-supervised learning | speech intelligibility prediction | STOI | Subjective listening tests | WER | Issue Date: | 1-Jan-2022 | Volume: | 2022-September | Source: | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Abstract: | Recently, deep learning (DL)-based non-intrusive speech assessment models have attracted great attention. Many studies report that these DL-based models yield satisfactory assessment performance and good flexibility, but their performance in unseen environments remains a challenge. Furthermore, compared to quality scores, fewer studies have developed deep learning models to estimate intelligibility scores. This study proposes a multi-task speech intelligibility prediction model, called MTI-Net, for simultaneously predicting human and machine intelligibility measures. Specifically, given a speech utterance, MTI-Net is designed to predict human subjective listening test results and word error rate (WER) scores. We also investigate several methods that can improve the prediction performance of MTI-Net. First, we compare different features (including low-level features and embeddings from self-supervised learning (SSL) models) and prediction targets of MTI-Net. Second, we explore the effect of transfer learning and multi-task learning on training MTI-Net. Finally, we examine the potential advantages of fine-tuning SSL embeddings. Experimental results demonstrate the effectiveness of using cross-domain features, multi-task learning, and fine-tuning SSL embeddings. Furthermore, it is confirmed that the intelligibility and WER scores predicted by MTI-Net are highly correlated with the ground-truth scores. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/633098 | ISSN: | 2308457X | DOI: | 10.21437/Interspeech.2022-10828 |
Appears in Collections: | Department of Computer Science and Information Engineering |