Video Question Answering with Iterative Video-Text Co-Tokenization.
A. J. Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova
Published in: CoRR (2022)
Keyphrases
- question answering
- named entities
- video data
- information retrieval
- video sequences
- information extraction
- natural language
- video content
- syntactic information
- text mining
- question answering systems
- natural language processing
- text documents
- video frames
- text summarization
- qa clef
- relation extraction
- key frames
- keywords
- multimedia