Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags.

Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra
Published in: ICASSP (2021)
Keyphrases
  • statistical learning
  • cross modal
  • visual similarity
  • multi modal
  • learning algorithm
  • metadata
  • keywords
  • feature space
  • multimedia databases
  • multimedia retrieval