Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model.
Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng
Published in: CoRR (2023)
Keyphrases
- denoising
- probabilistic model
- text to speech
- speech synthesis
- image denoising
- diffusion processes
- nonlinear diffusion
- prediction accuracy
- speech recognition
- prosodic features
- linear prediction
- wavelet domain
- prediction algorithm
- audio visual
- bayesian networks
- translation invariant
- prediction error
- language model
- multi stream
- prediction model
- anisotropic diffusion
- image processing
- wide variety
- noisy images
- wavelet packet
- denoising algorithm
- natural images
- multi modal
- diffusion process
- speech signal
- emotion recognition
- automatic speech recognition
- spoken language
- real world
- social networks