Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis.
Yutian Wang, Yuankun Xie, Kun Zhao, Hui Wang, Qin Zhang
Published in: ICME (2022)
Keyphrases
- speech synthesis
- speech recognition
- text-to-speech
- prosodic features
- vocal tract
- machine learning
- data-driven
- unsupervised learning
- speech corpus
- neural network
- feature representation
- image representation
- semi-supervised
- image retrieval
- supervised learning
- supervised classification
- representation scheme
- information bottleneck
- video sequences
- computer vision
- genetic algorithm