Variational Auto-Encoder based Mandarin Speech Cloning.
Qingyu Xing, Xiaohan Ma
Published in: CoRR (2022)
Keyphrases
- speech recognition
- broadcast news
- emotion recognition
- prosodic features
- spoken document retrieval
- speaker independent
- speech synthesis
- speaker identification
- speech signal
- automatic speech recognition
- image segmentation
- hidden Markov models
- speech recognizer
- bit rate
- rate distortion
- text to speech
- video codec
- optical flow
- language model
- motion estimation
- optical flow computation
- computer vision
- spontaneous speech
- methods in computer vision
- image processing
- speaker recognition
- speaker verification
- spoken language
- noisy environments
- audio visual
- low complexity
- variational framework
- error control
- video search
- decoding process
- variational methods
- recognition engine
- Gaussian mixture model
- endpoint detection