Expanding Language-Image Pretrained Models for General Video Recognition.
Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling
Published in: CoRR (2022)
Keyphrases
- image features
- object models
- static images
- image data
- image matching
- single image
- input image
- image classification
- partial occlusion
- multiscale
- key frames
- image content
- video data
- recognition rate
- multimedia
- template matching
- visual data
- image analysis
- object recognition
- programming language
- probabilistic model
- image collections
- spatial information
- three dimensional objects
- video images
- random fields
- license plate
- image frames
- video content
- image segmentation
- image regions
- segmentation method
- feature points
- high resolution
- video sequences
- video streams
- segmentation algorithm
- edge detection
- low level
- feature vectors
- image retrieval
- natural language
- road signs
- weakly labeled
- video files