VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix.
Teng Wang
Wenhao Jiang
Zhichao Lu
Feng Zheng
Ran Cheng
Chengguo Yin
Ping Luo
Published in:
ICML (2022)
Keyphrases
cross modal
multi modal
computer vision
multimedia retrieval
natural language
training set
visual data
image retrieval
supervised learning
training examples
wordnet
multimedia databases
visual recognition