DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control.
Hong Chen, Xin Wang, Yipeng Zhang, Yuwei Zhou, Zeyang Zhang, Siao Tang, Wenwu Zhu
Published in: CoRR (2024)
Keyphrases
- spatial and temporal
- spatial temporal
- spatio temporal
- space time
- text generation
- video content
- text detection
- video sequences
- control system
- video search
- multimedia
- control method
- real time
- temporal correlation
- video frames
- video data
- information retrieval
- video streams
- text documents
- natural language descriptions
- database
- temporal resolution
- keywords
- temporal relationships
- temporal domain
- multimedia search
- lecture videos
- audio content
- video images
- video analysis
- text retrieval
- multimedia data
- video coding
- text mining
- motion estimation