Investigating on incorporating pretrained and learnable speaker representations for multi-speaker multi-style text-to-speech

Chien C.-M; Lin J.-H; Huang C.-Y; Hsu P.-C; Lee H.-Y.

doi:10.1109/ICASSP39728.2021.9413880

Journal

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Journal Volume

2021-June

Pages

8588-8592

Date Issued

2021

Author(s)

Chien C.-M

Lin J.-H

Huang C.-Y

Hsu P.-C

HUNG-YI LEE

DOI

10.1109/ICASSP39728.2021.9413880

URI

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85106397888&doi=10.1109%2fICASSP39728.2021.9413880&partnerID=40&md5=9637afe10b6f5e4be8b287e41c0f7113

https://scholars.lib.ntu.edu.tw/handle/123456789/607161

Abstract

The few-shot multi-speaker multi-style voice cloning task is to synthesize utterances with voice and speaking style similar to a reference speaker given only a few reference samples. In this work, we investigate different speaker representations and propose to integrate pretrained and learnable speaker representations. Among different types of embeddings, the embedding pretrained by voice conversion achieves the best performance. The FastSpeech 2 model combined with both pretrained and learnable speaker representations shows great generalization ability on few-shot speakers and achieved 2nd place in the one-shot track of the ICASSP 2021 M2VoC challenge. © 2021 IEEE
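As a rough illustration of what integrating the two kinds of speaker representation could look like, the sketch below sums a projected pretrained embedding with a per-speaker learnable embedding and adds the result to every encoder frame. The abstract does not state how the representations are combined, so the summation, the broadcast-add conditioning, and all function and parameter names here are assumptions made for illustration, not the authors' implementation.

```python
def project(vec, weight):
    """Linear projection: weight has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def combine_speaker_representations(pretrained_vec, speaker_id, table, proj_weight):
    """Combine a pretrained embedding with a learnable one (assumed: by summation).

    pretrained_vec: fixed vector from an external speaker encoder
                    (for example, one pretrained on voice conversion).
    table:          learnable lookup table, one row per training speaker.
    proj_weight:    learnable matrix mapping pretrained_vec to the table's width.
    """
    learnable = table[speaker_id]
    projected = project(pretrained_vec, proj_weight)
    return [p + l for p, l in zip(projected, learnable)]


def condition_encoder_output(encoder_frames, speaker_vec):
    """Add the speaker vector to every frame of the text encoder output."""
    return [[h + s for h, s in zip(frame, speaker_vec)] for frame in encoder_frames]


if __name__ == "__main__":
    # Toy values: 2 training speakers, hidden size 2, pretrained embedding size 3.
    table = [[1.0, 2.0], [3.0, 4.0]]
    proj_weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    speaker_vec = combine_speaker_representations([0.5, 0.25, 9.0], 1, table, proj_weight)
    frames = condition_encoder_output([[0.0, 0.0], [1.0, 1.0]], speaker_vec)
    print(speaker_vec, frames)
```

In a trainable model, `table` and `proj_weight` would be parameters updated by gradient descent while the pretrained vector stays fixed; for an unseen few-shot speaker, a new table row would have to be fine-tuned or otherwise initialized, which this sketch leaves out.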

Subjects

Few-shot

Multi-speaker text-to-speech

Speaker representation

Clone cells

Embeddings

Generalization ability

Speaking styles

Text to speech

Voice conversion

Signal processing

SDGs

[SDGs]SDG4

Type

conference paper
