An Attention-Based Time-Frequency Pyramid Pooling Strategy in Deep Convolutional Networks for Acoustic Scene Classification.
Pengxu Jiang, Yang Yang, Cairong Zou, Qingyun Wang
Published in: IEEE Signal Process. Lett. (2024)
Keyphrases
- scene classification
- spatial pyramid matching
- image representation
- object recognition
- natural scenes
- image classification
- visual words
- sparse coding
- biologically inspired
- deep learning
- indoor outdoor
- scene recognition
- bag of words
- visual attention
- bag of visual words
- spatial layout
- bag of features
- multiscale
- scene representation
- multi-instance multi-label learning
- image content
- multiresolution
- natural images
- image processing
- computer vision
- saliency map
- unsupervised learning
- multi-modal
- co-occurrence