Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation.
Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari
Published in: Speech Commun. (2021)
Keyphrases
- gaussian process
- speech synthesis
- prosodic features
- speech recognition
- latent space
- vocal tract
- latent variables
- gaussian processes
- text to speech
- speaker verification
- gaussian process regression
- automatic speech recognition
- regression model
- gaussian process classification
- bayesian framework
- model selection
- approximate inference
- audio visual
- expectation propagation
- covariance function
- semi supervised
- hyperparameters
- random variables
- speech signal
- language model
- probability distribution
- hidden markov models
- image segmentation
- co occurrence
- sparse approximations
- computer vision
- pattern recognition
- graphical models
- noisy images