VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis.
Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee
Published in: CoRR (2024)
Keyphrases
- speech synthesis
- speech recognition
- prosodic features
- vocal tract
- text to speech
- latent variables
- hidden Markov models
- automatic speech recognition
- language model
- manifold learning
- latent space
- high dimensional
- low dimensional
- speaker diarization
- speaker identification
- speech signal
- lower dimensional
- pattern recognition
- speaker recognition
- category labels
- speech corpus
- audio visual
- linear prediction
- image acquisition
- image set
- Fisher information
- shortest path
- multi modal
- probabilistic model
- machine learning