Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts.
Deniz Engin, Yannis Avrithis
Published in: CoRR (2023)
Keyphrases
- multi modal
- question answering
- video content
- video data
- key frames
- semantic concepts
- video shots
- video sequences
- video search
- natural language processing
- question classification
- information extraction
- natural language
- news video
- video retrieval
- multiple modalities
- information retrieval
- question answering systems
- video database
- video frames
- cross language
- video streams
- video analysis
- passage retrieval
- audio visual
- syntactic information
- answer extraction
- natural language questions
- high dimensional
- multimedia
- answer validation
- video clips
- visual data
- image annotation
- visual features
- qa clef
- answering questions
- graph cuts
- artificial intelligence