VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation.
Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu
Published in: CoRR (2023)
Keyphrases
- space time
- spatial and temporal
- text generation
- video sequences
- spatiotemporal features
- natural language descriptions
- video search
- video frames
- video data
- text detection
- video content
- information retrieval
- video segments
- text retrieval
- multimedia
- text mining
- news video
- free text
- multimedia documents
- video streams
- graph cuts
- multimedia data
- visual saliency
- spatio temporal
- database
- moving objects
- dynamic textures
- video analysis
- dynamic scenes
- video clips
- video retrieval
- video surveillance
- natural language processing
- event detection
- saliency map
- information extraction
- natural language generation
- video images
- text information
- tv programs
- diffusion processes
- keywords
- temporal information
- text classification