Discourse-Level Prosody Modeling with a Variational Autoencoder for Non-Autoregressive Expressive Speech Synthesis.
Ning-Qian Wu, Zhaoci Liu, Zhen-Hua Ling
Published in: ICASSP (2022)
Keyphrases
- speech synthesis
- autoregressive
- speech recognition
- text-to-speech
- moving average
- prosodic features
- vocal tract
- non-stationary
- random fields
- Gaussian Markov random field
- image segmentation
- random field models
- SAR images
- natural images
- conditional random fields
- energy function
- gray scale
- model selection
- restricted Boltzmann machine
- Markov random field
- optical flow
- multiscale
- image sequences
- ARMA model
- computer vision