RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog.

Liucun Lu, Jinghui Qin, Zequn Jie, Lin Ma, Liang Lin, Xiaodan Liang
Published in: PRCV (1) (2023)
Keyphrases
  • multi modal
  • cross modal
  • audio visual
  • auto annotation
  • high dimensional
  • image annotation
  • video search
  • machine learning
  • image analysis
  • visual features