Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy.

Published in: ICLR (2024)

Keyphrases