Multi-Modal fusion with multi-level attention for Visual Dialog.
Jingping Zhang, Qiang Wang, Yahong Han
Published in: Inf. Process. Manag. (2020)
Keyphrases
- multi modal fusion
- selective attention
- visual information
- visual perception
- visual field
- visual attention
- neural network
- high level
- human vision
- conversational agents
- low level
- mixed initiative
- visual cues
- visual analysis
- similarity measure
- data sets
- eye movements
- image classification
- user interface
- feature space
- natural language