Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models.

David Kurzendörfer, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
Published in: CoRR (2024)
Keyphrases
  • multi modal
  • audio visual
  • audio features
  • pre trained
  • neural network
  • high dimensional
  • computer vision
  • input image
  • image annotation