ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models.
Minki Kang
Wooseok Han
Sung Ju Hwang
Eunho Yang
Published in:
CoRR (2023)
Keyphrases
text to speech synthesis
text to speech
probabilistic model
statistical models
prior knowledge
statistical model
neural network
learning algorithm
social networks
computational model
anisotropic diffusion
emotion recognition
speech synthesis
diffusion models