Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing.
Xingjian He, Weining Wang, Zhiyong Xu, Hao Wang, Jie Jiang, Jing Liu
Published in: CoRR (2021)
Keyphrases
- spatial temporal
- video shots
- video scene
- natural language
- semantic concepts
- spatio temporal
- action recognition
- spatial and temporal
- temporal information
- video retrieval
- video analysis
- scene analysis
- video data
- visual features
- semantic information
- visual content
- video sequences
- low level
- multi modal
- video clips
- key frames
- high level
- semantic similarity
- video database
- video frames
- feature space
- spatial information
- human body