Zero-Shot Audio Captioning via Audibility Guidance.
Tal Shaharabany, Ariel Shaulov, Lior Wolf
Published in: CoRR (2023)
Keyphrases
- multimedia
- audio video
- signal processing
- visual information
- cross modal
- audio visual
- multimedia information
- emotion recognition
- neural network
- soccer video
- music information retrieval
- audio recordings
- digital audio
- audio signals
- digital video
- visual data
- object categories
- multi modal
- knowledge base
- computer vision
- genetic algorithm
- machine learning