Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis.
Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov
Published in: CoRR (2024)
Keyphrases
- video sequences
- space time
- video clips
- online video
- video content
- multimedia
- video data
- real time video
- video streams
- video analysis
- video database
- video segments
- video images
- real time
- video search
- text detection
- spatiotemporal features
- human actions
- event detection
- video frames
- spatio temporal
- video surveillance
- semantic concepts
- digital video
- video processing
- key frames
- temporal information
- semantic labels
- natural language descriptions
- information retrieval