Multimodal Integration of Human-Like Attention in Visual Question Answering.
Ekta Sood, Fabian Kögel, Philipp Müller, Dominike Thomas, Mihai Bace, Andreas Bulling
Published in: CoRR (2021)
Keyphrases
- question answering
- natural language
- natural language processing
- information retrieval
- cross language
- information extraction
- question answering systems
- visual information
- multi modal
- natural language questions
- named entities
- question classification
- passage retrieval
- visual features
- qa clef
- textual entailment recognition
- syntactic information
- relation extraction
- low level
- audio visual
- artificial intelligence