Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs.
Shengbang Tong
Zhuang Liu
Yuexiang Zhai
Yi Ma
Yann LeCun
Saining Xie
Published in:
CoRR (2024)
Keyphrases
visual features
cross modal
multi modal
visual information
neural network
visual perception
multimodal information
narrow field of view
low level
wide range
feature space
feature vectors
multimodal data