SPT: Spatial Pyramid Transformer for Image Captioning.
Haonan Zhang, Pengpeng Zeng, Lianli Gao, Xinyu Lyu, Jingkuan Song, Heng Tao Shen
Published in: IEEE Trans. Circuits Syst. Video Technol. (2024)
Keyphrases
- spatial pyramid
- image classification
- image representation
- input image
- image features
- single image
- class specific
- multiscale
- image content
- image retrieval
- high resolution
- segmentation method
- naive bayes nearest neighbor
- matching scheme
- low level
- spatial information
- visual words
- image regions
- test images
- image collections
- similarity measure
- feature points
- object recognition
- image structure
- object class
- image segmentation
- decision trees
- computer vision