SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis.
Marco Comunità, Riccardo F. Gramaccioni, Emilian Postolache, Emanuele Rodolà, Danilo Comminiello, Joshua D. Reiss
Published in: CoRR (2023)
Keyphrases
- multimedia
- media streams
- story segmentation
- audio video
- multimodal information
- multimodal fusion
- scene change detection
- visual data
- audio visual
- video data
- broadcast news
- digital video
- multimedia processing
- video files
- video content analysis
- audio files
- multimedia information
- video content
- audio stream
- video signals
- video copy detection
- digital audio
- audio signals
- video streams
- video sequences
- audio visual content
- video frames
- video analysis
- video retrieval
- multi modal
- video indexing and retrieval
- real time
- audio features
- cross modal
- texture synthesis
- content based video retrieval
- multiple modalities
- visual speech
- music retrieval
- lecture videos
- soccer video
- hidden markov models
- news video
- video database
- video surveillance
- video material
- signal processing
- speech recognition