https://scholars.lib.ntu.edu.tw/handle/123456789/580917
Title: | WG-WaveNet: Real-time high-fidelity speech synthesis without GPU | Authors: | Hsu P.-C; HUNG-YI LEE |
Keywords: | Graphics processing unit; Speech synthesis; Computational resources; Flow-based models; Frequency domains; High-fidelity; Loss functions; Speech waveforms; Training data; Waveform generation; Speech communication | Publication date: | 2020 | Volume: | 2020-October | Pages: | 210-214 | Source publication: | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Abstract: | In this paper, we propose WG-WaveNet, a fast, lightweight, and high-quality waveform generation model. WG-WaveNet is composed of a compact flow-based model and a post-filter. The two components are jointly trained by maximizing the likelihood of the training data and optimizing loss functions on the frequency domains. As we design a flow-based model that is heavily compressed, the proposed model requires much less computational resources compared to other waveform generation models during both training and inference time; even though the model is highly compressed, the post-filter maintains the quality of generated waveform. Our PyTorch implementation can be trained using less than 8 GB GPU memory and generates audio samples at a rate of more than 960 kHz on an NVIDIA 1080Ti GPU. Furthermore, even if synthesizing on a CPU, we show that the proposed method is capable of generating 44.1 kHz speech waveform 1.2 times faster than real-time. Experiments also show that the quality of generated audio is comparable to those of other methods. Audio samples are publicly available online. © 2020 International Speech Communication Association. All rights reserved. |
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85098206854&doi=10.21437%2fInterspeech.2020-1736&partnerID=40&md5=ea7e26ff29eb02dbc24b649fa4dd0858 https://scholars.lib.ntu.edu.tw/handle/123456789/580917 |
ISSN: | 2308457X | DOI: | 10.21437/Interspeech.2020-1736 |
Appears in collections: | Department of Electrical Engineering |
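The speed figures quoted in the abstract can be related to one another with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the helper names `real_time_factor` and `required_throughput` are hypothetical, and it assumes the GPU figure of 960 kHz is compared against the same 44.1 kHz output sample rate quoted for the CPU result, which the abstract does not state explicitly. Because the abstract says "more than 960 kHz", the GPU value should be read as a lower bound.

```python
def real_time_factor(samples_per_second: float, sample_rate_hz: float) -> float:
    """Seconds of audio produced per second of wall-clock time."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return samples_per_second / sample_rate_hz


def required_throughput(rtf: float, sample_rate_hz: float) -> float:
    """Samples per second needed to reach a given real-time factor."""
    return rtf * sample_rate_hz


# Figures quoted in the abstract.
SAMPLE_RATE_HZ = 44_100      # output sample rate of the CPU experiment
GPU_THROUGHPUT_HZ = 960_000  # "more than 960 kHz" on an NVIDIA 1080Ti
CPU_RTF = 1.2                # "1.2 times faster than real-time" on a CPU

# Assumption: the GPU throughput is measured against 44.1 kHz audio.
gpu_rtf = real_time_factor(GPU_THROUGHPUT_HZ, SAMPLE_RATE_HZ)

# Throughput implied by the CPU claim at 44.1 kHz.
cpu_throughput_hz = required_throughput(CPU_RTF, SAMPLE_RATE_HZ)

print(f"GPU: at least {gpu_rtf:.1f}x real time")
print(f"CPU: about {cpu_throughput_hz:,.0f} samples per second")
```

Under these assumptions, the GPU rate corresponds to roughly 21.8 times real time, and the CPU claim corresponds to about 52,920 generated samples per second.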