Discourse-Level Prosody Modeling with a Variational Autoencoder for Non-Autoregressive Expressive Speech Synthesis.

Ning-Qian Wu, Zhaoci Liu, Zhen-Hua Ling

Published in: ICASSP (2022)

Keyphrases

speech synthesis
autoregressive
speech recognition
text-to-speech
moving average
prosodic features
vocal tract
non-stationary
random fields
Gaussian Markov random field
image segmentation
random field models
SAR images
natural images
conditional random fields
energy function
gray scale
model selection
restricted Boltzmann machine
Markov random field
optical flow
multiscale
image sequences
ARMA model
computer vision