Semantic Proximity Alignment: Towards Human Perception-consistent Audio Tagging by Aligning with Label Text Description.
Wuyang Liu, Yanzhen Ren
Published in: CoRR (2023)
Keyphrases
- human perception
- semantic labels
- semantic information
- text graphics
- human visual system
- high level
- trademark images
- visual information
- textual descriptions
- text mining
- natural language
- metadata
- information retrieval
- low level features
- database
- keywords
- natural language processing
- three dimensional
- semantic similarity