Task-Oriented Multi-Modal Mutual Leaning for Vision-Language Models.
Sifan Long, Zhen Zhao, Junkun Yuan, Zichang Tan, Jiangjiang Liu, Luping Zhou, Shengsheng Wang, Jingdong Wang
Published in: CoRR (2023)
Keyphrases
- multi modal
- language model
- language modeling
- n gram
- document retrieval
- speech recognition
- probabilistic model
- information retrieval
- multi modality
- retrieval model
- computer vision
- language modelling
- test collection
- context sensitive
- query expansion
- video search
- smoothing methods
- audio visual
- statistical language models
- relevance model
- high dimensional
- cross modal
- spoken term detection
- language models for information retrieval
- document ranking
- image annotation
- translation model
- pseudo relevance feedback