A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing.
Maomao Li, Yu Li, Tianyu Yang, Yunfei Liu, Dongxu Yue, Zhihui Lin, Dong Xu
Published in: CoRR (2023)
Keyphrases
- spatial temporal
- video editing
- video database
- expectation maximization
- video data
- video shots
- temporal information
- video segmentation
- video camera
- action recognition
- em algorithm
- video sequences
- spatio temporal
- spatial and temporal
- human actions
- video clips
- video analysis
- video retrieval
- video frames
- probabilistic model
- video content
- spatial information
- image processing
- multimedia
- multimedia content
- digital camera
- key frames
- information retrieval
- gaussian mixture model
- motion vectors