Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models.
Sivan Doveh, Assaf Arbelle, Sivan Harary, Roei Herzig, Donghyun Kim, Paola Cascante-Bonilla, Amit Alfassy, Rameswar Panda, Raja Giryes, Rogério Feris, Shimon Ullman, Leonid Karlinsky
Published in: NeurIPS (2023)