Improving robustness of one-shot voice conversion with deep discriminative speaker encoder.
Hongqiang Du, Lei Xie
Published in: CoRR (2021)
Keyphrases
- probabilistic model
- bit rate
- speech recognition
- discriminative power
- synthesized speech
- speaker recognition
- multimodal
- rate distortion
- face recognition
- speaker verification
- feature extraction
- video compression
- audio visual
- emotion recognition
- speaker identification
- mel frequency cepstral coefficients
- prosodic features
- successive approximation