QUALIFIER: Question-Guided Self-Attentive Multimodal Fusion Network for Audio Visual Scene-Aware Dialog.
Muchao Ye
Quanzeng You
Fenglong Ma
Published in:
WACV (2022)
Keyphrases
multimodal fusion
visual scene
audio visual
visual attention
high robustness
visual information
image sequences
relevance feedback
multimedia
eye tracking
natural images
user interface
natural scenes
gait recognition
three dimensional
higher level
object recognition
video sequences
search engine