AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation.
Shentong Mo
Yapeng Tian
Published in: CoRR (2023)
Keyphrases
audio visual
data sets
computer vision
multi modal
image sequences
spatio temporal
domain knowledge
video summarization
fully unsupervised