OmniVL: One Foundation Model for Image-Language and Video-Language Tasks.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Luowei Zhou, Yucheng Zhao, Yujia Xie, Ce Liu, Yu-Gang Jiang, Lu Yuan
Published in: NeurIPS (2022)
Keyphrases
- specification language
- probabilistic model
- statistical model
- similarity measure
- programming language
- image features
- video sequences
- image data
- language learning
- multiscale
- image classification
- image retrieval
- multimedia
- Bayesian framework
- single image
- input image
- probability distribution
- edge detection
- computational model
- segmentation method
- spatial information
- image content
- conceptual model
- natural language
- geometric constraints
- low level