Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing.
Shentong Mo
Yapeng Tian
Published in:
NeurIPS (2022)
Keyphrases
multi-modal
audio-visual
weakly supervised
audio features
visual data
multimedia
video data
topic models
relation extraction
natural language
high dimensional
named entities
superpixels
image annotation
video content
computer vision
natural language processing
e-learning