How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition.
George Sterpu, Christian Saam, Naomi Harte
Published in: CoRR (2020)
Keyphrases
- speech recognition
- hidden Markov models
- speech processing
- pattern recognition
- language model
- multi modal
- speech synthesis
- visual information
- speech signal
- speaker identification
- speech recognizer
- speech recognition technology
- speech understanding
- automatic speech recognition
- speech recognition systems
- speech recognizers
- keyword spotting
- isolated word
- noisy environments
- visual features
- speech retrieval
- low level
- natural language