Align and Prompt: Video-and-Language Pre-training with Entity Prompts.
Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
Published in: CVPR (2022)
Keyphrases
- video sequences
- programming language
- multimedia
- video data
- real time
- video database
- natural language
- training examples
- video frames
- video analysis
- named entities
- language learning
- video content
- supervised learning
- video images
- test set
- video streams
- specification language
- multimedia data
- key frames
- training phase
- pre-trained
- training algorithm
- training process
- video retrieval
- spatial and temporal
- training samples
- training set