Natural Language Supervision for General-Purpose Audio Representations.
Benjamin Elizalde, Soham Deshmukh, Huaming Wang
Published in: CoRR (2023)
Keyphrases
- general purpose
- natural language
- meaning representations
- special purpose
- domain specific
- human language
- multimedia
- semantic analysis
- active learning
- natural language generation
- natural language processing
- conceptual representation
- natural language interface
- question answering
- knowledge representation
- audio video
- natural language sentences
- signal processing
- visual information
- semantic representation
- semantic interpretation
- application specific
- audio visual
- visual data
- programming language
- representation scheme
- dialogue system
- language processing
- search engine
- emotion recognition
- tightly coupled
- higher level
- text to speech
- information extraction
- image retrieval
- image sequences
- high level
- music score