SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation.
Arya D. McCarthy, Liezl Puzon, Juan Miguel Pino
Published in: CoRR (2020)
Keyphrases
- speech recognition
- speaker recognition
- automatic speech recognition
- audio-visual
- speaker verification
- speaker identification
- speaker diarization
- speech signal
- neural network
- speaker dependent
- automatic speech recognition systems
- semi-automatic
- vector quantization
- broadcast news
- multi modal
- language model
- pattern recognition
- prosodic features
- encoding scheme
- synthesized speech
- speech music discrimination
- speech recognizer
- speech synthesis
- text-to-speech
- dialogue system
- Gaussian mixture model
- feature extraction