Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, Yoshua Bengio
Published in: CoRR (2015)
Keyphrases
- visual attention
- biological vision systems
- visual perception
- salient regions
- visual attention model
- saliency map
- vision system
- attention mechanism
- input image
- image features
- image retrieval
- image classification
- image regions
- stereoscopic images
- image data
- visual search
- eye tracking
- visual saliency detection
- eye movements
- multiscale
- image content
- image segmentation
- image structure
- image representation
- focus of attention
- visual motion
- image collections
- saliency detection
- eye fixations
- visual processing
- spatial relationships
- Bayesian framework
- spatial information
- visual saliency
- salient features
- visual input
- semantic information
- visual features
- higher order
- image sequences