ChiQA: A Large Scale Image-based Real-World Question Answering Dataset for Multi-Modal Understanding.
Bingning Wang, Feiyang Lv, Ting Yao, Yiming Yuan, Jin Ma, Yu Luo, Haijin Liang
Published in: CoRR (2022)
Keyphrases
- multi-modal
- question answering
- real world
- information retrieval
- natural language
- information extraction
- question classification
- natural language processing
- multi-modality
- audio-visual
- named entities
- QA@CLEF
- passage retrieval
- cross language
- answering questions
- uni-modal
- syntactic information
- data mining
- semantic roles
- candidate answers
- image annotation
- image content
- image classification
- natural language questions