DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs.

Published in: CoRR (2024)

Keyphrases