Gated Multi-Head Attention Pooling for Weakly Labelled Audio Tagging.

Sixin Hong, Yuexian Zou, Wenwu Wang
Published in: INTERSPEECH (2020)
Keyphrases
  • visual information
  • real time
  • metadata
  • multimedia
  • signal processing
  • audio visual
  • search engine
  • computer vision
  • e learning
  • multi modal
  • visual data
  • head pose estimation
  • social tagging