Masked co-attention model for audio-visual event localization.

Hengwei Liu, Xiaodong Gu
Published in: Appl. Intell. (2024)
Keyphrases
  • audio visual
  • multi modal
  • information retrieval
  • high level
  • search engine
  • training set