We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings.
Ismail Rasim Ulgen, Carlos Busso, John H. L. Hansen, Berrak Sisman
Published in: CoRR (2024)
Keyphrases
- speech synthesis
- speech recognition
- prosodic features
- vocal tract
- text to speech
- hidden Markov models
- automatic speech recognition
- language model
- speaker identification
- speech signal
- speaker dependent
- speech corpus
- pattern recognition
- vector space
- noisy environments
- speaker verification
- dimensionality reduction
- Euclidean space
- low dimensional
- speaker diarization
- case study
- feature selection
- data sets