Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models.
Tian Meng, Yang Tao, Ruilin Lyu, Wuliang Yin
Published in: CoRR (2024)
Keyphrases
- question answering
- language model
- image classification
- passage retrieval
- visual features
- information retrieval
- language modeling
- sentence retrieval
- document retrieval
- n gram
- visual information
- retrieval model
- probabilistic model
- information extraction
- speech recognition
- natural language processing
- question classification
- query expansion
- test collection
- feature extraction
- image representation
- bag of words
- natural language
- named entities
- image features
- key frames
- quantitative evaluation
- low level
- image annotation
- cross language
- audio visual
- relevance model
- translation model
- answer extraction
- vector space model
- video shots
- query terms
- information retrieval systems
- news video
- document collections
- video data