Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models.
Sivan Doveh, Assaf Arbelle, Sivan Harary, Roei Herzig, Donghyun Kim, Paola Cascante-Bonilla, Amit Alfassy, Rameswar Panda, Raja Giryes, Rogerio Feris, Shimon Ullman, Leonid Karlinsky
Published in: CoRR (2023)