Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions.

Tejas Srinivasan, Ramon Sanabria, Florian Metze

Published in: CoRR (2019)

Keyphrases

speech recognition
visual context
noisy environments
speech signal
hidden Markov models
automatic speech recognition
pattern recognition
temporal context
semantic context
language model
speaker identification
multimodal
scene interpretation
object detection
speech recognition systems
video annotation
audio visual
low level
multimedia
neural network