Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition.
Joanna Hong
Minsu Kim
Daehun Yoo
Yong Man Ro
Published in: CoRR (2022)
Keyphrases
end to end
audio visual speech recognition
visual context
audio visual
multi stream
temporal context
semantic context
noisy environments
multimedia
audio signal
image features
feature set
computer vision
feature vectors
speech recognition
visual information
hidden Markov models
visual scene
visual speech