Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention.
Bin Duan
Hao Tang
Wei Wang
Ziliang Zong
Guowei Yang
Yan Yan
Published in:
CoRR (2020)
Keyphrases
audio visual
multimodal fusion
person authentication
multi modal
visual information
multi stream
multimedia
visual data
video summarization
audio visual speech recognition
event detection
emotion recognition
domain knowledge