Human-centric Spatio-Temporal Video Grounding With Visual Transformers.

Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu

Published in: CoRR (2020)

Keyphrases

human centric
spatio temporal
spatial and temporal
space time
human actions
video sequences
video frames
human centered
visual features
video data
context awareness
multimedia
visual information
human computer interaction
video surveillance
context aware
temporal information
human motion
high level
video content
image sequences
action recognition
pose estimation
intelligent systems
e government