Multi-modal preference alignment remedies regression of visual instruction tuning on language model.
Shengzhi Li, Rongyu Lin, Shichao Pei
Published in: CoRR (2024)
Keyphrases
- multi modal
- language model
- cross modal
- language modeling
- n gram
- probabilistic model
- video search
- retrieval model
- document retrieval
- information retrieval
- ad hoc information retrieval
- mixture model
- query expansion
- multi modality
- single modality
- context sensitive
- speech recognition
- smoothing methods
- test collection
- audio visual
- multiple modalities
- multimedia
- query terms
- visual data
- translation model
- low level
- high dimensional
- image annotation
- high level
- word clouds