Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9.
Do Hyun Lee, Yoonah Song, Hong Kook Kim
Published in: CoRR (2024)
Keyphrases
- language model
- source separation
- audio features
- language modeling
- visual features
- n-gram
- probabilistic model
- speech recognition
- blind source separation
- query expansion
- information retrieval
- independent component analysis
- test collection
- audio-visual
- multimedia
- signal processing
- denoising
- single channel
- natural language
- semantic information
- feature set
- multi-modal
- cross-lingual
- news video