Natural Language Supervision For General-Purpose Audio Representations.
Benjamin Elizalde, Soham Deshmukh, Huaming Wang
Published in: ICASSP (2024)
Keyphrases
- general purpose
- natural language
- multimedia
- domain specific
- meaning representations
- human language
- special purpose
- conceptual representation
- application specific
- semantic interpretation
- knowledge representation
- programming language
- audio video
- question answering
- natural language understanding
- active learning
- semantic representations
- language processing
- semantic analysis
- visual information
- higher level
- audio files
- natural language sentences
- neural network
- music score
- speaker identification
- natural language interface
- tightly coupled
- natural language generation
- representation scheme
- audio visual
- multi modal
- natural language processing
- machine learning