Do Vision-Language Transformers Exhibit Visual Commonsense? An Empirical Study of VCR.

Zhenyang Li, Yangyang Guo, Kejie Wang, Xiaolin Chen, Liqiang Nie, Mohan S. Kankanhalli
Published in: CoRR (2024)