Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models.
Jingru Yi, Burak Uzkent, Oana Ignat, Zili Li, Amanmeet Garg, Xiang Yu, Linda Liu
Published in: WACV (2024)
Keyphrases
- language model
- language modeling
- image content
- image data
- image features
- n-gram
- language modelling
- probabilistic model
- document retrieval
- statistical language models
- pairwise
- image retrieval
- low level
- computer vision
- speech recognition
- image representation
- test collection
- key frames
- caption text
- semantic information
- query expansion
- image classification
- smoothing methods
- document collections
- hidden markov models
- similarity measure
- information retrieval