Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model.
Wenqi Zhang, Zhenglin Cheng, Yuanyu He, Mengna Wang, Yongliang Shen, Zeqi Tan, Guiyang Hou, Mingqian He, Yanna Ma, Weiming Lu, Yueting Zhuang
Published in: CoRR (2024)
Keyphrases
- language model
- low level
- n-gram
- language modeling
- document retrieval
- image features
- information retrieval
- image content
- image representation
- retrieval model
- image classification
- image retrieval
- speech recognition
- query expansion
- probabilistic model
- test collection
- image collections
- image segmentation
- mixture model
- statistical model
- language model for information retrieval
- visual information
- query terms
- bayesian framework
- visual features
- multi-modal