Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners.
Shen Yan, Tao Zhu, Zirui Wang, Yuan Cao, Mi Zhang, Soham Ghosh, Yonghui Wu, Jiahui Yu
Published in: CoRR (2022)
Keyphrases
- video sequences
- multimedia
- natural language descriptions
- video data
- video streams
- video search
- video segments
- text detection
- real time
- key frames
- video event detection
- multimedia search
- online video
- free text
- video retrieval
- text retrieval
- multimedia data
- semantic information
- text mining
- event detection
- temporal information
- digital video
- multimedia documents
- event recognition
- news video
- object detection
- spatio temporal
- metadata
- computer vision