Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks.
Zhecan Wang
Noel Codella
Yen-Chun Chen
Luowei Zhou
Xiyang Dai
Bin Xiao
Jianwei Yang
Haoxuan You
Kai-Wei Chang
Shih-Fu Chang
Lu Yuan
Published in:
CoRR (2022)
Keyphrases
vision system
visually guided
computer vision
real time
multi modal
language learning
genetic algorithm
image processing
language processing
natural language
robotic systems
medical images
multiple tasks
video compression
image sequences
information systems
databases