An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models.
Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- image data
- image features
- image classification
- probabilistic model
- n gram
- document retrieval
- retrieval model
- information retrieval
- image segmentation
- image retrieval
- language modelling
- image content
- image regions
- computer vision
- speech recognition
- image representation
- query expansion
- test collection
- low level
- markov random field
- pseudo relevance feedback
- relevance model
- statistical language models