Multimodal Encoder-Decoder Attention Networks for Visual Question Answering.
Chongqing Chen, Dezhi Han, Jun Wang
Published in: IEEE Access (2020)
Keyphrases
- question answering
- decoding process
- video codec
- question classification
- distributed video coding
- natural language processing
- passage retrieval
- information retrieval
- qa clef
- named entities
- natural language questions
- cross language
- open domain question answering
- syntactic information
- natural language
- visual information
- visual features
- question answering systems
- rate distortion
- sentence retrieval
- answer validation
- multi modal
- information extraction
- relation extraction
- textual entailment recognition
- wyner ziv
- semantic roles
- audio visual
- low level
- video coding
- test set
- video sequences