One-Shot Voice Conversion by Vector Quantization

Wu, D.-Y.; Hung-yi Lee

doi:10.1109/ICASSP40776.2020.9053854

Journal

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Journal Volume

2020-May

Pages

7734-7738

Date Issued

2020

URI

https://www.scopus.com/inward/record.url?eid=2-s2.0-85089227176&partnerID=40&md5=250cdc69c2b513111c3181093ce099d7

https://scholars.lib.ntu.edu.tw/handle/123456789/558970

Abstract

In this paper, we propose a vector quantization (VQ) based approach to one-shot voice conversion (VC) that requires no supervision from speaker labels. We model the content embedding as a sequence of discrete codes and take the difference between the vectors before and after quantization as the speaker embedding. We show that this approach has a strong ability to disentangle content and speaker information using only a reconstruction loss, and one-shot VC is thus achieved. © 2020 IEEE.
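The split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder and decoder networks are omitted, the codebook is fixed rather than learned, and averaging the per-frame residual over time to get a single speaker vector is an assumption made here for illustration. All function names (`quantize`, `split_content_speaker`, `recombine`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frame: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry nearest to the frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))


def split_content_speaker(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], Vector]:
    """Split encoded frames into discrete content codes and a speaker vector.

    Content: the nearest codebook index for each frame.
    Speaker: the before-minus-after quantization difference, averaged over
    frames (the averaging is an assumption of this sketch).
    """
    codes = [quantize(f, codebook) for f in frames]
    dim = len(frames[0])
    speaker = [0.0] * dim
    for f, k in zip(frames, codes):
        for d in range(dim):
            speaker[d] += (f[d] - codebook[k][d]) / len(frames)
    return codes, speaker


def recombine(
    codes: Sequence[int],
    speaker: Sequence[float],
    codebook: Sequence[Sequence[float]],
) -> List[Vector]:
    """Build decoder inputs by adding a speaker vector to each content code."""
    return [[c + s for c, s in zip(codebook[k], speaker)] for k in codes]


# Toy 2-dimensional example with a hypothetical 2-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0]]
source_frames = [[0.25, -0.25], [1.25, 0.75]]  # utterance to convert
target_frames = [[-0.5, 0.5]]                  # single target-speaker utterance

source_codes, source_speaker = split_content_speaker(source_frames, codebook)
_, target_speaker = split_content_speaker(target_frames, codebook)

# One-shot conversion: source content combined with target speaker vector.
converted = recombine(source_codes, target_speaker, codebook)
```

In a full system, `converted` would be passed to a decoder that synthesizes acoustic features. Only one utterance from the target speaker is needed to obtain `target_speaker`, which is what makes the conversion one-shot.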

Subjects

disentangled representations; vector quantization; voice conversion

SDGs

SDG 10

Type

conference paper
