PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns.
Yew Ken Chia
Vernon Toh Yan Han
Deepanway Ghosal
Lidong Bing
Soujanya Poria
Published in:
CoRR (2024)
Keyphrases
language model
visual patterns
n-gram
probabilistic model
generative model
information retrieval
visual features
image features
natural images
test collection
statistical modeling
natural scenes
high level
multi-modal
higher level
pattern discovery
image data
visual information
texture synthesis