Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts.
Deniz Engin, Yannis Avrithis
Published in: ICCV (Workshops) (2023)
Keyphrases
- multi modal
- question answering
- video content
- video data
- video shots
- semantic concepts
- video sequences
- key frames
- video search
- natural language processing
- natural language
- information retrieval
- multiple modalities
- question classification
- news video
- qa clef
- high dimensional
- video streams
- audio visual
- information extraction
- cross language
- video frames
- passage retrieval
- natural language questions
- video clips
- video database
- video retrieval
- multimedia
- video analysis
- syntactic information
- image annotation
- visual features
- question answering systems
- answer validation
- machine learning
- qa systems
- answer extraction
- high level