Song-to-Video Translation: Writing a Video from Song Lyrics Based on Multimodal Pre-training.
Feifei Fu, Zelong Sun, Guoxing Yang, Xiaolong He, Zhiwu Lu
Published in: ADMA (2) (2023)
Keyphrases
- audio features
- multimedia
- audio files
- video streams
- video content
- real time
- video data
- video sequences
- video frames
- audio visual
- video clips
- digital audio
- video database
- video analysis
- video surveillance
- human activities
- event detection
- space time
- human computer interaction
- multi modal
- feature set
- classification accuracy
- low level
- music retrieval
- training set
- neural network