STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding.
Zihang Lin
Chaolei Tan
Jian-Fang Hu
Zhi Jin
Tiancai Ye
Wei-Shi Zheng
Published in:
CoRR (2022)
Keyphrases
cross modal
spatio temporal
multi modal
space time
dynamic textures
visual data
video sequences
human actions
multimedia retrieval
multimedia
video data
multimedia databases
video streams
high level
video frames
image retrieval
video content
moving objects
video analysis
visual features
image sequences
low level