https://scholars.lib.ntu.edu.tw/handle/123456789/633658
Title: | DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores | Authors: | Tseng, Wei Cheng; Kao, Wei Tsung; HUNG-YI LEE |
Keywords: | MOS prediction | self-supervised learning | Issue Date: | 1-Jan-2022 | Volume: | 2022-September | Source: | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Abstract: | Mean opinion score (MOS) is a typical subjective evaluation metric for speech synthesis systems. Since collecting MOS is time-consuming, accurate MOS prediction models for automatic evaluation would be desirable. In this work, we propose DDOS, a novel MOS prediction model. DDOS uses domain-adaptive pre-training to further pre-train self-supervised learning models on synthetic speech. It also adds a module that models the opinion score distribution of each utterance. With the proposed components, DDOS outperforms previous works on the BVCC dataset, and its zero-shot transfer result on the BC2019 dataset is significantly improved. DDOS also won second place in the Interspeech 2022 VoiceMOS Challenge in terms of system-level score. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/633658 | ISSN: | 2308457X | DOI: | 10.21437/Interspeech.2022-11247 |
Appears in Collections: | Department of Electrical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.