Multimodal architecture for video captioning with memory networks and an attention mechanism.

Wei Li, Dashan Guo, Xiangzhong Fang

Published in: Pattern Recognit. Lett. (2018)

Keyphrases

attention mechanism
real time
multimedia
video sequences
video data
multimodal
saliency map
visual attention
video frames
video content
visual attention model
video streams
key frames
video analysis
video surveillance
memory management
human computer interaction
associative memory
high bandwidth
higher order