Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning.

Rui Li, Dong Pu, Minnie Huang, Bill Huang

Published in: CoRR (2021)

Keyphrases

text to speech
prosodic features
speech synthesis
speaker verification
training set
speech recognition
synthesized speech
vocal tract
feature extraction
training examples
real time
noisy environments
emotion recognition
speaker recognition
knowledge transfer
speaker diarization
transfer learning
multimodal
feature space
case study
information systems
data sets