AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding.
Xing Zhang, Jiaxi Gu, Haoyu Zhao, Shicong Wang, Hang Xu, Renjing Pei, Songcen Xu, Zuxuan Wu, Yu-Gang Jiang
Published in: CoRR (2024)
Keyphrases
- spatial and temporal
- temporal information
- real time
- temporal consistency
- spatial temporal
- space time
- temporal coherence
- video frames
- computer vision
- spatio temporal
- video content
- temporal analysis
- video data
- video streams
- video sequences
- vision system
- temporal segmentation
- temporal correlation
- temporal databases
- training set
- video analysis
- temporal structure
- temporal relationships
- temporal constraints
- temporal reasoning
- programming language
- temporal data
- video database
- dynamic textures
- natural language
- multimedia
- temporal domain
- video surveillance
- video shots
- motion trajectories
- language learning
- supervised learning
- temporal order
- temporal resolution
- linear temporal logic
- video clips
- human actions
- multimedia data
- training samples
- neural network