Collaborative Audio-Visual Event Localization Based on Sequential Decision and Cross-Modal Consistency.
Yuqian Kuang
Xiaopeng Fan
Published in: ICASSP (2023)
Keyphrases
audio visual
cross modal
multi modal
visual data
event detection
high dimensional
image annotation
visual information
image retrieval
multimedia
multimedia data
video data
image data
data analysis
multimedia retrieval
visual similarity
feature extraction