Attention-Based Keyword Localisation in Speech Using Visual Grounding.
Kayode Olaleye, Herman Kamper
Published in: Interspeech (2021)
Keyphrases
- selective attention
- keywords
- visual information
- low level
- visual cues
- visual perception
- speaker recognition
- visual features
- speech recognition
- audio visual
- visual field
- speech signal
- focus of attention
- keyword extraction
- visual saliency
- speech synthesis
- visual stimuli
- content based video retrieval
- visual speech
- information retrieval
- recognition engine
- human vision
- automatic speech recognition
- receptive fields
- visual attention
- computer vision