Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness.
Liangliang Cao
Bowen Zhang
Chen Chen
Yinfei Yang
Xianzhi Du
Wencong Zhang
Zhiyun Lu
Yantao Zheng
Published in:
CoRR (2023)
Keyphrases
text detection
text regions
supervised learning
training examples
video clips
training set
face detection