Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training.
Zejun Li, Zhihao Fan, Jingjing Chen, Qi Zhang, Xuanjing Huang, Zhongyu Wei
Published in: ACL (1) (2023)
Keyphrases
- cross lingual
- weakly supervised
- cross modal
- machine translation
- multi modal
- language modeling
- object class
- computer vision
- topic models
- document clustering
- text classification
- transfer learning
- image retrieval
- training set
- object detection
- natural language
- language model
- supervised learning
- training data
- semi supervised
- knn
- low level
- multiscale
- similarity measure