CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks.
Zhecan Wang
Noel Codella
Yen-Chun Chen
Luowei Zhou
Jianwei Yang
Xiyang Dai
Bin Xiao
Haoxuan You
Shih-Fu Chang
Lu Yuan
Published in:
CoRR (2022)
Keyphrases
video clips
language learning
vision system
computer vision
language processing
programming language
real time
low level features
natural language
neural network
databases
search engine
learning algorithm
markov decision processes
key frames
information retrieval
database
multiple tasks