Multimodal Integration of Human-Like Attention in Visual Question Answering.
Ekta Sood, Fabian Kögel, Philipp Müller, Dominike Thomas, Mihai Bâce, Andreas Bulling
Published in: CVPR Workshops (2023)
Keyphrases
- question answering
- question classification
- information extraction
- natural language processing
- natural language
- information retrieval
- cross language
- multi modal
- passage retrieval
- natural language questions
- named entities
- visual information
- qa clef
- open domain question answering
- sentence retrieval
- question answering systems
- low level
- candidate answers
- relation extraction
- syntactic information
- audio visual
- expert systems