Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading.
Minsu Kim
Jeong Hun Yeo
Yong Man Ro
Published in:
AAAI (2022)
Keyphrases
lip reading
visual speech
head tracking
visual information
speaker identification
hidden markov models
visual data
expression recognition
noisy environments
visual features
audio signals
particle filter
multimedia
visual cues
gaussian mixture model
audio visual
head movements
audio signal
multi modal