Sketch, Ground, and Refine: Top-Down Dense Video Captioning.
Chaorui Deng, Shizhe Chen, Da Chen, Yuan He, Qi Wu
Published in: CVPR (2021)
Keyphrases
- video data
- video content
- video sequences
- real time
- video frames
- multimedia
- video surveillance
- computational complexity
- high level
- video streams
- video database
- key frames
- real time video
- video processing
- spatial and temporal
- temporal information
- video analysis
- event recognition
- video segmentation
- online video
- event detection
- generative model
- feature vectors
- object recognition
- multiscale
- data sets