From Pixels to Objects: Cubic Visual Attention for Visual Question Answering.
Jingkuan Song, Pengpeng Zeng, Lianli Gao, Heng Tao Shen
Published in: CoRR (2022)
Keyphrases
- question answering
- visual attention
- visual scene
- visual input
- focus of attention
- visual search
- object based visual attention
- saliency map
- eye tracking
- eye movements
- vision system
- visual saliency
- natural language processing
- information retrieval
- information extraction
- visual information
- higher level
- visual motion
- salient regions
- natural language
- passage retrieval
- syntactic information
- input image
- qa clef
- spatial relations
- question classification
- visual features
- question answering systems
- object recognition
- computer vision
- qa systems
- visual data
- machine learning
- answer validation
- image retrieval
- knowledge base
- answering questions
- natural language questions