Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy.

Published in: CoRR (2024)

Keyphrases