Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering.
David Romero, Thamar Solorio
Published in: CoRR (2024)
Keyphrases
- question answering
- question classification
- question answering systems
- qa clef
- natural language questions
- qa systems
- answer extraction
- open domain question answering
- answer validation
- answering questions
- candidate answers
- visual data
- video data
- natural language processing
- natural language
- named entities
- question answer pairs
- information retrieval
- cross language
- high level
- video search
- video sequences
- information extraction
- visual information
- syntactic information
- video content
- low level
- relation extraction
- multimedia
- passage retrieval
- news video
- visual features
- video frames
- key frames
- textual entailment recognition
- semantic roles
- multi modal
- video shots
- video retrieval
- search engine
- artificial intelligence