Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder.
Yusuke Yasuda, Tomoki Toda
Published in: CoRR (2022)
Keyphrases
- probabilistic model
- latent variables
- text-to-speech synthesis
- graphical models
- generative model
- text-to-speech
- posterior distribution
- latent variable models
- anisotropic diffusion
- Bayesian inference
- diffusion process
- language model
- hidden variables
- Gaussian process
- conditional random fields
- hierarchical model
- approximate inference
- image segmentation
- real-valued
- optical flow
- Bayesian networks
- topic models
- expectation maximization
- structured prediction
- latent structure
- data sets
- co-occurrence
- collaborative filtering
- information extraction
- probabilistic latent semantic analysis
- latent space
- machine learning