https://scholars.lib.ntu.edu.tw/handle/123456789/607157
Title: | FragmentVC: Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention | Authors: | Lin Y.Y; Chien C.-M; Lin J.-H; HUNG-YI LEE; LIN-SHAN LEE |
Keywords: | Any-to-any;Attention mechanism;Concatenative;Transformer;Voice conversion;Speech analysis;Speech recognition;Attention mechanisms;Hidden structures;Objective evaluation;Phonetic structure;Real-world scenario;Speaker verification;Subjective evaluations;Signal processing | Issue Date: | 2021 | Volume: | 2021-June | Pages: | 5939-5943 | Source: | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Abstract: | Any-to-any voice conversion aims to convert the voice from and to any speaker, even speakers unseen during training. This is much more challenging than one-to-one or many-to-many conversion, but much more attractive in real-world scenarios. In this paper, we propose FragmentVC, in which the latent phonetic structure of the utterance from the source speaker is obtained from Wav2Vec 2.0, while the spectral features of the utterance(s) from the target speaker are obtained from log mel-spectrograms. By aligning the hidden structures of the two different feature spaces with a two-stage training process, FragmentVC is able to extract fine-grained voice fragments from the target speaker utterance(s) and fuse them into the desired utterance. The whole process is based on the attention mechanism of the Transformer, as verified by analysis of attention maps, and is accomplished end-to-end. The approach is trained with a reconstruction loss only, without any disentanglement considerations between content and speaker information, and does not require parallel data. Objective evaluation based on speaker verification and subjective evaluation with MOS both showed that this approach outperformed SOTA approaches such as AdaIN-VC and AUTOVC. © 2021 IEEE |
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85111438144&doi=10.1109%2fICASSP39728.2021.9413699&partnerID=40&md5=cd0e19d069322b478325a5ab88472748 https://scholars.lib.ntu.edu.tw/handle/123456789/607157 |
ISSN: | 15206149 | DOI: | 10.1109/ICASSP39728.2021.9413699 |
Appears in Collections: | Department of Electrical Engineering |
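The abstract describes extracting voice fragments from the target speaker and fusing them under the guidance of the source utterance via attention. The following is a minimal sketch of that general idea using plain scaled dot-product cross-attention, not the authors' implementation. The function names, the toy feature values, and the use of hand-written keys (instead of learned projections of Wav2Vec 2.0 features and log mel-spectrograms) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (hypothetical helper).

    queries: one vector per source frame (stand-in for content features)
    keys:    one vector per target frame (same width as the queries)
    values:  one vector per target frame (stand-in for mel-spectrogram frames)
    Returns the fused frames and the attention map.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    fused, attention_map = [], []
    for q in queries:
        # Similarity of this source frame to every target frame.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted mix of target frames: the "fused fragment" for this step.
        frame = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        attention_map.append(weights)
        fused.append(frame)
    return fused, attention_map


# Toy data (assumed values): 3 source frames, 2 target frames, 3 mel bins.
source_content = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_keys = [[4.0, 0.0], [0.0, 4.0]]
target_mels = [[0.1, 0.2, 0.3], [0.7, 0.8, 0.9]]

fused, attn = cross_attention(source_content, target_keys, target_mels)
```

Each row of `attn` is a distribution over target frames, which is the kind of attention map the abstract refers to when it mentions verifying the behavior by inspecting attention.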
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.