DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever.
Zhichao Yin
Binyuan Hui
Min Yang
Fei Huang
Yongbin Li
Published in:
CoRR (2024)
Keyphrases
multi modal
multi modality
video clips
image annotation
video search
audio visual
semantic concepts
high dimensional
cross modal
fusing multiple
image analysis
higher level
low level features
uni modal