Span-based Audio-Visual Localization.
Yiling Wu
Xinfeng Zhang
Yaowei Wang
Qingming Huang
Published in:
ACM Multimedia (2022)
Keyphrases
audio visual
multi modal
visual information
visual data
emotion recognition
temporal context
high dimensional
audio visual speech recognition
multi stream
multimedia
video summarization
person authentication
multimodal fusion
databases
visual features
feature selection
information retrieval