Multi-Temporal Lip-Audio Memory for Visual Speech Recognition.
Jeong Hun Yeo, Minsu Kim, Yong Man Ro
Published in: ICASSP (2023)
Keyphrases
- visual speech recognition
- visual speech
- lip reading
- dynamic textures
- hidden Markov models
- speaker identification
- spatial and temporal
- multimedia
- spatio temporal
- noisy environments
- space time
- image sequences
- audio signals
- speech signal
- audio features
- facial expression recognition
- visual information
- Gaussian mixture model
- speech recognition
- feature extraction