UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling.

Zhengyuan Yang, Zhe Gan, Jianfeng Wang, Xiaowei Hu, Faisal Ahmed, Zicheng Liu, Yumao Lu, Lijuan Wang
Published in: ECCV (36) (2022)
Keyphrases