OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding.
Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- context sensitive
- n gram
- probabilistic model
- document retrieval
- speech recognition
- statistical language models
- query expansion
- test collection
- video data
- language modelling
- ad hoc information retrieval
- video sequences
- multimedia
- smoothing methods
- information retrieval
- language model for information retrieval
- probabilistic retrieval models
- retrieval model
- multi modal
- feature selection
- video content
- video frames
- language models for information retrieval
- query terms
- video retrieval
- vector space model
- audio visual
- error rate
- document length
- hidden markov models
- machine learning