Integrating multimodal features by a two-way co-attention mechanism for visual question answering.
Himanshu Sharma, Swati Srivastava
Published in: Multim. Tools Appl. (2024)
Keyphrases
- question answering
- question classification
- low level
- information extraction
- attention mechanism
- natural language questions
- semantic roles
- question answering systems
- information retrieval
- natural language processing
- natural language
- cross language
- qa clef
- feature vectors
- feature extraction
- co occurrence
- passage retrieval
- answering questions
- visual information
- visual attention
- natural images
- candidate answers
- answer extraction
- multi modal
- high level
- artificial intelligence