Keyword-Aware Relative Spatio-Temporal Graph Networks for Video Question Answering.
Yi Cheng, Hehe Fan, Dongyun Lin, Ying Sun, Mohan S. Kankanhalli, Joo-Hwee Lim
Published in: CoRR (2023)
Keyphrases
- question answering
- spatio-temporal
- human actions
- information extraction
- keywords
- named entities
- natural language processing
- information retrieval
- natural language
- qa clef
- multimedia
- video sequences
- sentence retrieval
- passage retrieval
- question classification
- video content
- syntactic information
- video data
- image sequences
- video frames
- cross-language
- graph structure
- semantic roles
- question answering systems
- action recognition
- qa systems
- multi-modal
- natural language questions
- video retrieval
- keyword search
- video shots
- structured data
- language model
- textual entailment recognition