VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis.

Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee

Published in: CoRR (2024)

Keyphrases

speech synthesis
speech recognition
prosodic features
vocal tract
text-to-speech
latent variables
hidden Markov models
automatic speech recognition
language model
manifold learning
latent space
high dimensional
low dimensional
speaker diarization
speaker identification
speech signal
lower dimensional
pattern recognition
speaker recognition
category labels
speech corpus
audio visual
linear prediction
image acquisition
image set
Fisher information
shortest path
multi modal
probabilistic model
machine learning