Generalized pyramid co-attention with learnable aggregation net for video question answering.
Lianli Gao, Tangming Chen, Xiangpeng Li, Pengpeng Zeng, Lei Zhao, Yuan-Fang Li
Published in: Pattern Recognit. (2021)
Keyphrases
- question answering
- natural language processing
- video sequences
- natural language
- information extraction
- multimedia
- information retrieval
- video data
- question classification
- cross language
- passage retrieval
- syntactic information
- video content
- sentence retrieval
- qa clef
- question answering systems
- named entities
- answer validation
- relation extraction
- natural language questions
- qa systems
- multi modal
- open domain question answering
- answering questions
- visual data
- image representation