Dual-modality seq2seq network for audio-visual event localization.
Yan-Bo Lin
Yu-Jhe Li
Yu-Chiang Frank Wang
Published in:
CoRR (2019)
Keyphrases
audio visual
multi modal
video summarization
multi stream
temporal context
multimedia
visual information
visual data
audio visual speech recognition
audio features
emotion recognition
event detection
three dimensional
search engine
knowledge base
person authentication
multimodal fusion
data sets