Multimodal CLIP Inference for Meta-Few-Shot Image Classification.
Constance Ferragu, Philomène Chagniot, Vincent Coyette
Published in: CoRR (2024)
Keyphrases
- image classification
- visual features
- key frames
- image features
- image representation
- low level features
- multi modal
- video sequences
- bag of words
- meta level
- video shots
- video data
- probabilistic inference
- class specific
- Bayesian networks
- feature extraction
- visual words
- Bayesian inference
- multimodal interaction
- feature selection
- shot boundary detection
- video clips
- sparse coding
- video content
- feature vectors
- image segmentation