Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment.
Wenliang Zhong, Wenyi Wu, Qi Li, Robert A. Barton, Boxin Du, Shioulin Sam, Karim Bouyarmane, Ismail B. Tutar, Junzhou Huang
Published in: CoRR (2024)
Keyphrases
- visual representation
- language model
- multi instance
- language modeling
- visual representations
- real valued
- n gram
- probabilistic model
- multi instance learning
- multi label
- user interface
- document retrieval
- information retrieval
- semi supervised learning
- query expansion
- smoothing methods
- multi class
- test collection
- language models for information retrieval
- multi modal
- binary classification
- image understanding
- unsupervised learning
- generative model
- graph cuts
- supervised learning
- high dimensional
- expert systems
- image segmentation