VLRM: Vision-Language Models act as Reward Models for Image Captioning.
Maksim DzabraevAlexander KunitsynAndrei IvaniutaPublished in: CoRR (2024)
Keyphrases
- language model
- probabilistic model
- language modeling
- statistical language models
- image content
- language modelling
- image data
- n gram
- image features
- relevance model
- smoothing methods
- information retrieval
- image classification
- statistical model
- document retrieval
- speech recognition
- image representation
- image retrieval
- query expansion
- translation model
- statistical models
- okapi bm
- machine learning
- test collection
- context sensitive
- fusion method
- ir models
- term dependencies
- low level
- computer vision
- language modeling approaches