Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models.
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
Published in:
CoRR (2024)
Keyphrases
multi-modal
audio-visual
audio features
pre-trained
neural network
high-dimensional
computer vision
input image
image annotation