Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning.
Rui Li, Dong Pu, Minnie Huang, Bill Huang
Published in: CoRR (2021)
Keyphrases
- text to speech
- prosodic features
- speech synthesis
- speaker verification
- training set
- speech recognition
- synthesized speech
- vocal tract
- feature extraction
- training examples
- real time
- noisy environments
- emotion recognition
- speaker recognition
- knowledge transfer
- speaker diarization
- transfer learning
- multi modal
- feature space
- case study
- information systems
- data sets