MPE4G: Multimodal Pretrained Encoder for Co-Speech Gesture Generation.
Gwantae Kim, Seonghyeok Noh, Insung Ham, Hanseok Ko
Published in: ICASSP (2023)
Keyphrases
- multimodal interfaces
- multimodal interaction
- human computer interaction
- multi stream
- learning mechanism
- audio visual
- gesture recognition
- closely related
- hidden markov models
- bayesian networks
- bit rate
- rate distortion
- search algorithm
- rate control
- user interface
- low complexity
- multi modal
- future trends
- speech synthesis
- pointing gestures
- text to speech
- speaker identification
- hand gestures
- speech recognition
- motion estimation
- spoken language
- continuous stream