Depth and Video Segmentation Based Visual Attention for Embodied Question Answering.
Haonan Luo, Guosheng Lin, Yazhou Yao, Fayao Liu, Zichuan Liu, Zhenmin Tang
Published in: IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Keyphrases
- question answering
- video segmentation
- visual attention
- saliency map
- eye tracking
- video sequences
- video frames
- eye movements
- natural language processing
- video analysis
- visual search
- vision system
- segmentation method
- information retrieval
- information extraction
- qa clef
- salient regions
- higher level
- natural language
- passage retrieval
- question answering systems
- machine learning
- visual saliency
- natural language questions
- higher order
- bag of words
- video data
- human computer interaction
- natural images
- qa systems
- answer extraction
- user interface