Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu
Published in: ICASSP (2020)
Keyphrases
- fine-grained
- text-to-speech
- autoregressive
- speech synthesis
- coarse-grained
- moving average
- non-stationary
- Gaussian Markov random field
- autoregressive model
- random fields
- text-to-speech synthesis
- prosodic features
- access control
- word processing
- random field models
- prior knowledge
- machine learning
- information retrieval