Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models.
Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
Published in: CoRR (2020)
Keyphrases
- language model
- pre-trained
- language modeling
- n-gram
- training data
- probabilistic model
- speech recognition
- retrieval model
- statistical language models
- language modelling
- single image
- computer vision
- information retrieval
- training examples
- video sequences
- query expansion
- vision system
- real scenes
- smoothing methods
- control signals
- appearance variations
- language models for information retrieval
- input image
- moving objects
- image sequences
- multi-modal
- visual information
- feature space