One-Shot Voice Conversion by Vector Quantization
Journal
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Journal Volume
2020-May
Pages
7734-7738
Date Issued
2020
Author(s)
Wu, D.-Y.
Abstract
In this paper, we propose a vector quantization (VQ) based approach to one-shot voice conversion (VC) that requires no supervision from speaker labels. We model the content embedding as a sequence of discrete codes and take the difference between the vectors before and after quantization as the speaker embedding. We show that this approach disentangles content and speaker information effectively using only a reconstruction loss, which makes one-shot VC possible. © 2020 IEEE.
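The split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random arrays stand in for the output of a trained encoder, the names `quantize` and `split_content_speaker` are hypothetical, averaging the residual over time to get one speaker vector per utterance is an assumption, and the decoder that would turn the combined representation back into speech is omitted.

```python
import numpy as np


def quantize(frames, codebook):
    """Map each frame (row) to its nearest codebook entry by squared Euclidean distance."""
    # dists has shape (T, K): distance from every frame to every code.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=1)
    return codebook[codes], codes


def split_content_speaker(frames, codebook):
    """Return (content, speaker, codes) for one utterance.

    content: quantized frames, i.e. the discrete-code representation.
    speaker: residual between the vectors before and after quantization,
             averaged over time (the averaging is an assumption of this sketch).
    """
    content, codes = quantize(frames, codebook)
    speaker = (frames - content).mean(axis=0)
    return content, speaker, codes


# Stand-ins for encoder outputs (hypothetical sizes: 8 codes, 4 dimensions).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
source_frames = rng.normal(size=(20, 4))   # utterance whose content we keep
target_frames = rng.normal(size=(15, 4))   # single utterance from the target speaker

src_content, src_speaker, src_codes = split_content_speaker(source_frames, codebook)
_, tgt_speaker, _ = split_content_speaker(target_frames, codebook)

# One-shot conversion: source content combined with the target speaker vector.
# A trained decoder (not shown) would consume this to synthesize speech.
converted = src_content + tgt_speaker
```

Only one target utterance is needed to compute `tgt_speaker`, which is what makes the conversion one-shot in this sketch.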
Subjects
disentangled representations; vector quantization; voice conversion
Type
conference paper