ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation.
Weihan Wang
Zhen Yang
Bin Xu
Juanzi Li
Yankui Sun
Published in:
CoRR (2023)
Keyphrases
natural language
real time
programming language
computer vision
image processing
language learning
specification language
operational semantics
neural network
information retrieval
multimedia
training process
textual information
conceptual graphs
language processing
visual field