HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis.
Sang-Hoon Lee, Ha-Yeong Choi, Seung-Bin Kim, Seong-Whan Lee
Published in: CoRR (2023)
Keyphrases
- speech synthesis
- prosodic features
- speech recognition
- text to speech
- variational inference
- vocal tract
- Bayesian inference
- posterior distribution
- variational methods
- pattern recognition
- natural language
- hidden Markov models
- closed form
- latent Dirichlet allocation
- automatic speech recognition
- neural network
- language model
- probabilistic model
- machine learning