Beyond Bilinear: Generalized Multi-modal Factorized High-order Pooling for Visual Question Answering.
Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, Dacheng Tao
Published in: CoRR (2017)
Keyphrases
- multi modal
- high order
- question answering
- cross modal
- higher order
- video search
- question classification
- single modality
- natural language
- information retrieval
- information extraction
- question answering systems
- pairwise
- natural language questions
- syntactic information
- natural language processing
- passage retrieval
- qa clef
- visual information
- audio visual
- visual features
- markov random field
- high dimensional
- multiple modalities
- singular value decomposition
- answer extraction
- image annotation
- artificial intelligence