X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers.
Jaemin Cho
Jiasen Lu
Dustin Schwenk
Hannaneh Hajishirzi
Aniruddha Kembhavi
Published in:
CoRR (2020)
Keyphrases
multi modal
answer questions
visual features
multi modality
cross modal
high dimensional
semantic concepts
audio visual
image annotation
humanoid robot
video retrieval
news video
image classification
fusing multiple
multimedia retrieval
image analysis
keywords
controlled natural language