Training Audio Captioning Models without Audio.
Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang
Published in: ICASSP (2024)
Keyphrases
- multimedia
- visual information
- audio video
- audio signals
- structured prediction
- classification models
- audio recordings
- multimedia information
- audio visual
- training algorithm
- visual data
- training process
- neural network
- multimedia data
- statistical models
- training examples
- online learning
- signal processing
- low level
- information retrieval