How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition.
George Sterpu, Christian Saam, Naomi Harte
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2020)
Keyphrases
- speech recognition
- hidden Markov models
- language model
- speech synthesis
- multi modal
- automatic speech recognition
- speech recognizer
- pattern recognition
- speech signal
- speech processing
- visual information
- speech recognition technology
- keyword spotting
- speaker identification
- visual features
- speech understanding
- speech recognition systems
- isolated word
- signal processing
- speaker independent
- speaker dependent
- speech retrieval
- learning styles
- speech recognition errors