CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling.
Ruihan Yang, Hannes Gamper, Sebastian Braun
Published in: CoRR (2023)
Keyphrases
- multi modal
- audio visual
- cross modal
- semantic concepts
- video search
- multimedia
- audio video
- single modality
- multi modality
- multiple modalities
- high dimensional
- audio features
- video sequences
- image annotation
- visual data
- video data
- video streams
- broadcast news
- video content
- image processing
- video shots
- video database
- human actions
- audio files
- visual information