Aligning Where to See and What to Tell: Image Captioning with Region-Based Attention and Scene-Specific Contexts.
Kun Fu, Junqi Jin, Runpeng Cui, Fei Sha, Changshui Zhang
Published in: IEEE Trans. Pattern Anal. Mach. Intell. (2017)
Keyphrases
- input image
- single image
- image segmentation
- image regions
- image data
- scene images
- image classification
- multiple objects
- image features
- image representation
- reference images
- image retrieval
- complex scenes
- image content
- scene understanding
- scene matching
- image frames
- imaging process
- scene classification
- geometric information
- image set
- 3D scene
- multiscale
- low level
- image sequences
- intensity images
- image matching
- edge detection
- three dimensional
- multiple images
- location and orientation
- real world scenes
- scene geometry
- feature points
- video sequences
- image motion
- real scenes
- pixel level
- dynamic scenes