Text-to-feature diffusion for audio-visual few-shot learning.
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
Published in:
CoRR (2023)
Keyphrases
audio-visual
spatio-temporal
multi-modal
computer vision
multimedia
video sequences
object recognition