Why are Visually-Grounded Language Models Bad at Image Classification?
Yuhui Zhang, Alyssa Unell, Xiaohan Wang, Dhruba Ghosh, Yuchang Su, Ludwig Schmidt, Serena Yeung-Levy
Published in: CoRR (2024)
Keyphrases
- language model
- image classification
- language modeling
- n gram
- probabilistic model
- bag of words
- document retrieval
- information retrieval
- speech recognition
- feature extraction
- language modelling
- image features
- image representation
- retrieval model
- context sensitive
- visual features
- query expansion
- statistical language models
- test collection
- translation model
- multi label
- pseudo relevance feedback
- vector space model
- word error rate
- ad hoc information retrieval
- smoothing methods
- language models for information retrieval
- term dependencies
- passage retrieval
- relevance model
- cross lingual
- visual words