BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions.
Wenbo Hu
Yifan Xu
Yi Li
Weiyue Li
Zeyuan Chen
Zhuowen Tu
Published in:
CoRR (2023)
Keyphrases
high level
open domain
information retrieval
text mining
multimedia
database
semantic content
visual information
web images
medical images
text information
cross-modal
visual appearance
key concepts
co-occurrence
low level
neural network
data sets