Enhance audio-visual segmentation with hierarchical encoder and audio guidance.
Cunhan Guo
Heyan Huang
Yanghao Zhou
Published in:
Neurocomputing (2024)
Keyphrases
audio visual
multi modal
temporal segmentation
visual information
visual data
emotion recognition
multi stream
audio visual speech recognition
audio features
audio visual content
multimodal fusion
video scene
multimedia
image regions
action recognition
co occurrence
text mining
multiscale
feature selection
data sets