Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model.
Kenichi Fujita, Takanori Ashihara, Hiroki Kanagawa, Takafumi Moriya, Yusuke Ijima
Published in: CoRR (2023)
Keyphrases
- computational model
- text to speech synthesis
- prior knowledge
- probabilistic model
- formal model
- image representation
- experimental data
- statistical model
- mathematical model
- pattern recognition
- genetic algorithm
- management system
- data sets
- theoretical analysis
- cost function
- process model
- objective function
- bayesian networks
- gaussian mixture model
- social networks
- spatial structure