Trainable performance upper bounds for image and video captioning.
Li Yao, Nicolas Ballas, KyungHyun Cho, John R. Smith, Yoshua Bengio
Published in: CoRR (2015)
Keyphrases
- upper bound
- image data
- single image
- image features
- image segmentation
- image frames
- input image
- images and video sequences
- image pixels
- image representation
- feature points
- image analysis
- image retrieval
- multiscale
- lower bound
- video sequences
- image content
- template matching
- multimedia
- video images
- static images
- edge detection
- region of interest
- video frames
- image classification
- image matching
- visual cues
- video data
- key frames
- image collections
- pixel values
- video analysis
- visual data
- image regions
- space time
- pre-trained