Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.
Liangfu Cao, Lianli Gao, Jingkuan Song, Xing Xu, Heng Tao Shen
Published in: ADC (2017)
Keyphrases
- question answering
- cross modal
- perceptual information
- natural language
- multi modal
- visual recognition
- natural language processing
- information retrieval
- low level
- information extraction
- named entities
- learning tasks
- multimedia retrieval
- question answering systems
- active learning
- high level
- semantic concepts
- metadata
- semantic roles