Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models.
Vishaal Udandarao, Max F. Burg, Samuel Albanie, Matthias Bethge
Published in: CoRR (2023)
Keyphrases
- language model
- visual data
- language modeling
- visual information
- n-gram
- probabilistic model
- information retrieval
- image data
- computer vision
- audio visual
- multimedia data
- smoothing methods
- visual features
- high dimensional
- contextual information
- video data
- video sequences
- test collection
- high dimensional data
- image content
- human motion
- visual content
- image sequences
- decision trees
- object recognition
- video content
- feature selection