Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs.
Shiyu Xuan
Qingpei Guo
Ming Yang
Shiliang Zhang
Published in:
CoRR (2023)
Keyphrases
multi modal
multi modality
high dimensional
audio visual
cross modal
image annotation
computer vision
video sequences
visual information
computer assisted
humanoid robot