Leveraging Foundation models for Unsupervised Audio-Visual Segmentation.
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Xiatian Zhu
Published in:
CoRR (2023)
Keyphrases
audio visual
multi modal
image segmentation
multiscale
machine learning
high level
image retrieval
spatio temporal
image database
dimensionality reduction
visual data
temporal context