An Interpretable Multimodal Visual Question Answering System using Attention-based Weighted Contextual Features.
Yu Wang
Yilin Shen
Hongxia Jin
Published in:
AAMAS (2020)
Keyphrases
contextual features
visual features
contextual information
visual information
question answering
multi modal
conditional random fields
natural language
image processing
low level
graphical models
audio visual
machine learning
computer vision
spatial relations