Audiovisual Transformer with Instance Attention for Audio-Visual Event Localization.
Yan-Bo Lin
Yu-Chiang Frank Wang
Published in:
ACCV (6) (2020)
Keyphrases
audio visual
multi modal
visual information
emotion recognition
visual data
multi stream
multimedia
temporal context
audio visual speech recognition
video summarization
person authentication
event detection
hidden markov models
image database
co occurrence
knn
high level
computer vision
data sets