Creating Song From Lip and Tongue Videos With a Convolutional Vocoder.
Jianyu Zhang, Pierre Roussel, Bruce Denby
Published in: IEEE Access (2021)
Keyphrases
- motion features
- video sequences
- video frames
- computer aided
- motion analysis
- audio features
- video data
- ultrasound image sequences
- video analysis
- ultrasound images
- user generated
- event recognition
- key frames
- human activities
- action recognition
- face detection
- video clips
- deep learning
- natural images
- optical flow
- machine learning