Zero and R2D2: A Large-scale Chinese Cross-modal Benchmark and A Vision-Language Framework.

Chunyu Xie, Heng Cai, Jianfei Song, Jincheng Li, Fanjing Kong, Xiaoyu Wu, Henrique Morimitsu, Lin Yao, Dexin Wang, Dawei Leng, Xiangyang Ji, Yafeng Deng
Published in: CoRR (2022)
Keyphrases
  • cross modal
  • multi modal
  • visual features
  • computer vision
  • nearest neighbor
  • information retrieval systems