Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering.
Pan Lu, Hongsheng Li, Wei Zhang, Jianyong Wang, Xiaogang Wang
Published in: CoRR (2017)
Keyphrases
- multi modal
- question answering
- free form
- cross modal
- video search
- image features
- information retrieval
- single modality
- natural language processing
- question classification
- passage retrieval
- natural language
- complex scenes
- cross language
- information extraction
- syntactic information
- qa clef
- feature vectors
- audio visual
- qa systems
- semantic concepts
- high dimensional
- visual information
- image annotation
- visual features
- machine learning
- answering questions
- image classification
- similarity measure
- high level
- image processing
- multimedia
- artificial intelligence
- question answering systems
- language model
- feature set
- relevance feedback
- multiple modalities
- object recognition