InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model.
Haogeng Liu, Quanzeng You, Yiqi Wang, Xiaotian Han, Bohan Zhai, Yongfei Liu, Wentao Chen, Yiren Jian, Yunzhe Tao, Jianbo Yuan, Ran He, Hongxia Yang
Published in: ACL (Findings) (2024)
Keyphrases
- language model
- language modeling
- document retrieval
- n gram
- probabilistic model
- information retrieval
- query expansion
- ad hoc information retrieval
- speech recognition
- retrieval model
- mixture model
- vector space model
- smoothing methods
- test collection
- statistical language models
- visual features
- query terms
- context sensitive
- visual representation
- multimedia
- word error rate
- multi modal
- document ranking
- language model for information retrieval
- dependency structure
- relevance model
- text mining
- information extraction