Lost in Translation: When GPT-4V(ision) Can't See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond.

Xiang Zhang, Senyu Li, Zijun Wu, Ning Shi
Published in: CoRR (2023)
Keyphrases