Login / Signup
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs.
Lingchen Meng
Jianwei Yang
Rui Tian
Xiyang Dai
Zuxuan Wu
Jianfeng Gao
Yu-Gang Jiang
Published in:
CoRR (2024)
Keyphrases
</>
neural network
computer vision
highly reliable
visual features
low level
image sequences
face recognition
bayesian networks
expert systems
artificial neural networks
high level
text classification
learning algorithm
visual information
visual data
visual perception
data sets