RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog.
Liucun Lu
Jinghui Qin
Zequn Jie
Lin Ma
Liang Lin
Xiaodan Liang
Published in:
PRCV (1) (2023)
Keyphrases
multi modal
cross modal
audio visual
auto annotation
high dimensional
image annotation
video search
machine learning
image analysis
visual features