Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding.
Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng
Published in: CVPR (2023)
Keyphrases
- spatio temporal
- real time
- spatial and temporal
- spatial and temporal relationships
- spatial temporal
- space time
- video representation
- human actions
- computer vision
- video sequences
- video files
- video data
- programming language
- vision system
- spatio temporally
- moving objects
- temporal domain
- video content
- video streams
- multimedia
- language learning
- video database
- video copy detection
- temporal segmentation
- digital video
- image sequences
- video retrieval
- natural language
- video frames
- video surveillance
- dynamic textures
- temporal relationships
- temporal structure
- temporal information
- data streams
- collaborative learning
- motion patterns
- key frames
- multi stream
- image processing