Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags.
Xavier Favory
Konstantinos Drossos
Tuomas Virtanen
Xavier Serra
Published in:
CoRR (2020)
Keyphrases
cross-modal
learning algorithm
multimedia
multi-modal
perceptual information
visual recognition
visual similarity
learning tasks
metadata
natural language processing
visual information
visual content