Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation.
Zhiwu Qing, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yujie Wei, Yingya Zhang, Changxin Gao, Nong Sang
Published in: CoRR (2023)
Keyphrases
- spatio temporal
- spatial and temporal
- video representation
- space time
- spatial temporal
- text generation
- human actions
- video data
- video sequences
- video content
- temporal domain
- text detection
- spatio temporally
- video frames
- moving objects
- news video
- real time
- video streams
- video search
- multimedia
- natural language descriptions
- dynamic textures
- video database
- action recognition
- text retrieval
- input output
- information retrieval
- image sequences
- video retrieval
- video analysis
- coarse to fine
- video shots
- text mining
- text documents
- database
- multimedia documents
- video segments
- semantic labels
- temporal segmentation
- human activities
- video collections
- keywords
- digital video
- video surveillance
- spatial and temporal relationships
- spatio temporal data
- video images
- natural language generation
- motion trajectories
- surveillance videos
- video clips
- key frames