Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders.
Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao
Published in: ICCV (2021)
Keyphrases
- audio visual
- audio stream
- audio signals
- broadcast news
- gesture recognition
- multimedia
- hidden Markov models
- cepstral features
- hand gestures
- digital audio
- audio features
- visual information
- speaker identification
- prosodic features
- audio recordings
- audio video
- emotion recognition
- human robot interaction
- speech music discrimination
- hand movements
- text to speech
- sign language
- multi modal
- acoustic signals
- visual data
- speech processing
- multi stream
- visual speech
- spoken words
- speech synthesis
- denoising
- spoken documents
- linear predictive coding
- speaker verification
- body movements
- neural network
- graphical models
- optical flow
- image processing