Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models.
Jingru Yi, Burak Uzkent, Oana Ignat, Zili Li, Amanmeet Garg, Xiang Yu, Linda Liu
Published in: CoRR (2023)
Keyphrases
- language model
- language modeling
- image data
- image content
- probabilistic model
- pairwise
- information retrieval
- image features
- image classification
- n-gram
- caption text
- image retrieval
- image representation
- document retrieval
- similarity measure
- test collection
- image regions
- computer vision
- context sensitive
- semantic information
- vector space
- language modelling
- okapi bm
- language models for information retrieval
- low level
- language model for information retrieval