DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction.
Hamed R. Tavakoli
Ali Borji
Esa Rahtu
Juho Kannala
Published in:
CoRR (2019)
Keyphrases
audio visual
multi modal
visual information
temporal context
multi stream
visual data
data sets
multimedia
emotion recognition
video summarization
audio visual speech recognition
knowledge base
feature extraction
object recognition
domain knowledge
multimodal fusion