Curriculum Masking in Vision-Language Pretraining to Maximize Cross Modal Interaction.
Kraig Tou
Zijun Sun
Published in:
NAACL-HLT (2024)
Keyphrases
cross modal
multi modal
computer vision
natural language
multimedia retrieval
multimedia databases
visual recognition
visual similarity
image retrieval
image data
perceptual information
nearest neighbor
image classification