Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation.
Guy Yariv, Itai Gat, Sagie Benaim, Lior Wolf, Idan Schwartz, Yossi Adi
Published in: CoRR (2023)
Keyphrases
- multimedia
- video content
- video data
- video sequences
- audio video
- video frames
- video content analysis
- digital video
- real time
- multimedia processing
- scene change detection
- natural language descriptions
- video streams
- information retrieval
- video clips
- video database
- audio content
- multimedia information
- video retrieval
- closed captions
- online video
- lecture videos
- video signals
- multimedia documents
- visual data
- text detection
- video recordings
- video search
- human activities
- audio files
- multi modal
- video material
- signal processing
- multimedia data
- key frames