TVQA+: Spatio-Temporal Grounding for Video Question Answering.
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
Published in: ACL (2020)
Keyphrases
- question answering
- spatio-temporal
- human actions
- passage retrieval
- information retrieval
- video sequences
- natural language
- natural language processing
- cross language
- qa clef
- information extraction
- video data
- syntactic information
- multimedia
- question classification
- open domain question answering
- image sequences
- named entities
- video frames
- video retrieval
- natural language questions
- sentence retrieval
- answer extraction
- video database
- video content
- question answering systems
- relation extraction
- answer validation
- semantic roles
- wordnet
- audio visual
- visual data
- key frames
- speech transcripts