Human-centric Spatio-Temporal Video Grounding With Visual Transformers.
Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu
Published in: CoRR (2020)
Keyphrases
- human centric
- spatio temporal
- spatial and temporal
- space time
- human actions
- video sequences
- video frames
- human centered
- visual features
- video data
- context awareness
- multimedia
- visual information
- human computer interaction
- video surveillance
- context aware
- temporal information
- human motion
- high level
- video content
- image sequences
- action recognition
- pose estimation
- intelligent systems
- e government