MM-Reasoner: A Multi-Modal Knowledge-Aware Framework for Knowledge-Based Visual Question Answering.
Mahmoud Khademi, Ziyi Yang, Felipe Frujeri, Chenguang Zhu
Published in: EMNLP (Findings) (2023)
Keyphrases
- multi-modal
- question answering
- cross-modal
- knowledge base
- audio-visual
- passage retrieval
- information extraction
- information retrieval
- natural language
- natural language processing
- multi-modality
- video search
- natural language questions
- multiple modalities
- domain knowledge
- high-dimensional
- expert systems
- qa clef
- candidate answers
- single modality
- question classification
- image annotation
- visual features
- feature space
- high-level
- feature selection