M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning.

Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu
Published in: CoRR (2023)
Keyphrases
  • multi modal
  • cross modal
  • multi modality
  • audio visual
  • semantic concepts
  • high dimensional
  • low level
  • video search
  • computer vision
  • multimedia
  • object recognition