Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning.
Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris N. Metaxas, Sergey Tulyakov
Published in: CVPR (2022)
Keyphrases
- multimedia
- video sequences
- video data
- video content
- real time
- multi modal
- video streams
- video analysis
- video processing
- video retrieval
- real time video
- video frames
- multimodal interaction
- video segmentation
- video shots
- temporal coherence
- possibility theory
- visual data
- story segmentation
- multimodal information
- program synthesis
- multiple modalities
- event recognition
- audio visual
- video images
- compressed video
- quality metrics
- multi party
- video clips
- event detection
- space time
- image sequences
- neural network