Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models.

Zhang Li, Biao Yang, Qiang Liu, Zhiyin Ma, Shuo Zhang, Jingxu Yang, Yabo Sun, Yuliang Liu, Xiang Bai
Published in: CoRR (2023)
Keyphrases
  • multi modal
  • image resolution
  • multi modality
  • video search
  • image quality
  • cross modal
  • image annotation
  • audio visual
  • semantic concepts
  • machine learning
  • high dimensional
  • field of view
  • metadata
  • multiple modalities