Disentangling Content and Motion for Text-Based Neural Video Manipulation
Levent Karacan, Tolga Kerimoglu, Ismail Inan, Tolga Birdal, Erkut Erdem, Aykut Erdem. Published in: BMVC (2022)
Keyphrases
- multimedia
- video footage
- space-time
- online video
- key frames
- video sequences
- successive frames
- multimedia data
- object motion
- spatial and temporal
- input video
- visual cues
- video frames
- video content
- video data
- moving camera
- motion estimation
- temporal filtering
- motion features
- dynamic scenes
- visual data
- static images
- video segments
- low frame rate
- dynamic textures
- motion capture data
- shot change detection
- image sequences
- neural network
- semantic information
- visual features
- human motion
- multimedia documents
- video streams
- motion analysis
- motion capture
- temporal consistency
- optical flow
- network architecture
- moving objects
- multimedia content
- metadata
- video clips
- textual descriptions
- spatio-temporal
- lecture videos
- visual motion
- video scene
- single frame
- reference frame
- video retrieval
- video analysis
- event detection
- camera motion
- motion model
- motion trajectories
- user generated
- video signals
- motion planning
- temporal coherence
- human actions
- camera movement
- temporal continuity
- computer vision
- layered representation
- motion segmentation