Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models.

Chuofan Ma, Yi Jiang, Jiannan Wu, Zehuan Yuan, Xiaojuan Qi
Published in: CoRR (2024)
Keyphrases