Ventriloquist-Net: Leveraging Speech Cues for Emotive Talking Head Generation.
Deepan Das, Qadeer Khan, Daniel Cremers
Published in: ICIP (2022)
Keyphrases
- audio visual
- speech recognition
- emotion recognition
- speech signal
- multi modal
- visual cues
- high level
- prosodic features
- prior knowledge
- recognition engine
- noisy environments
- facial animation
- endpoint detection
- neural network
- hand movements
- text to speech
- head motion
- generation process
- hidden Markov models
- video sequences
- information retrieval