Multi-Scale Speaker Vectors for Zero-Shot Speech Synthesis.
Tristin Cory, Razib Iqbal
Published in: COMPSAC (2022)
Keyphrases
- speech synthesis
- speech recognition
- multiscale
- prosodic features
- vocal tract
- text to speech
- automatic speech recognition
- scale space
- hidden Markov models
- vector space
- pattern recognition
- edge detection
- natural images
- language model
- image segmentation
- image processing
- deep structure
- image representation
- speaker dependent
- speaker identification
- multiple scales
- speech corpus
- optic flow
- coarse to fine
- principal components
- noisy environments
- speech signal
- speaker verification
- speaker recognition
- audio visual
- speaker diarization
- feature vectors
- vector quantization
- wavelet transform
- machine learning