Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses.
Jing Shi, Jia Xu, Boqing Gong, Chenliang Xu
Published in: CVPR (2019)
Keyphrases
- weakly supervised
- weakly labeled
- video frames
- key frames
- clustering algorithm
- similarity measure
- visual features
- video sequences
- object class
- video data
- superpixels
- k-means
- relation extraction
- topic models
- visual information
- semi supervised
- contextual information
- data points
- distance metric
- named entities
- semantic similarity
- moving objects
- unsupervised learning
- conditional random fields
- multi class