Dual-modality Seq2Seq Network for Audio-visual Event Localization.
Yan-Bo Lin
Yu-Jhe Li
Yu-Chiang Frank Wang
Published in:
ICASSP (2019)
Keyphrases
audio visual
multi modal
multi stream
emotion recognition
multimedia
visual information
audio visual speech recognition
video summarization
event detection
person authentication
temporal context
databases
visual data
image classification
text mining
nearest neighbor
high level