Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation.
Guy Yariv, Itai Gat, Sagie Benaim, Lior Wolf, Idan Schwartz, Yossi Adi
Published in: AAAI (2024)
Keyphrases
- multimedia
- audio video
- video sequences
- natural language descriptions
- multimedia processing
- audio content
- digital video
- video data
- video content
- online video
- text detection
- video streams
- video segments
- visual data
- content based video retrieval
- digital audio
- video database
- information retrieval
- video shots
- video analysis
- broadcast news
- closed captions
- video files
- signal processing
- real time
- scene change detection
- video content analysis
- video collections
- audio signals
- text to speech
- lecture videos
- video search
- audio visual
- video frames
- visual features
- multi modal
- keywords