PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM.

Tao Yang, Yingmin Luo, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen
Published in: CoRR (2024)
Keyphrases
  • multi modal
  • hierarchically organized
  • multi modality
  • audio visual
  • image annotation
  • high dimensional
  • semantic concepts
  • fusing multiple
  • computer vision
  • high level
  • humanoid robot
  • video search