Dense Image Representation with Spatial Pyramid VLAD Coding of CNN for Locally Robust Captioning.
Andrew Shin, Masataka Yamaguchi, Katsunori Ohnishi, Tatsuya Harada
Published in: CoRR (2016)
Keyphrases
- image representation
- spatial pyramid
- image classification
- bag of words
- multiscale
- image content
- image search
- visual words
- scene recognition
- image features
- image retrieval
- feature space
- sparse coding
- low level features
- bag of visual words
- object recognition
- bag of features
- scene classification
- sparse representation
- image matching
- CBIR systems
- object retrieval
- linear combination
- visual vocabulary
- machine learning