Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows.
Iván Vallés-Pérez
Julian Roth
Grzegorz Beringer
Roberto Barra-Chicote
Jasha Droppo
Published in:
CoRR (2021)
Keyphrases
text to speech
prosodic features
speech synthesis
speaker verification
standard deviation
speech recognition
synthesized speech
rate distortion
bit rate
audio visual
inter frame
noisy environments
distributed video coding
video codec
low complexity
residual signal
motion estimation