Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization.
Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya Zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
Published in: CoRR (2022)
Keyphrases
- weakly supervised
- object localization
- object detectors
- relation extraction
- superpixels
- object class
- topic models
- natural language
- computer vision
- training set
- semi supervised
- object detection
- image processing
- supervised learning
- multiscale
- named entities
- bounding box
- training samples
- natural images
- pairwise
- training data