Spatio-Temporal Attention Pooling for Audio Scene Classification.
Huy Phan, Oliver Y. Chén, Lam Dang Pham, Philipp Koch, Maarten De Vos, Ian McLoughlin, Alfred Mertins
Published in: INTERSPEECH (2019)
Keyphrases
- scene classification
- spatio-temporal
- object recognition
- natural scenes
- image classification
- spatial pyramid matching
- indoor outdoor
- visual words
- visual attention
- biologically inspired
- image representation
- scene recognition
- bag of features
- multimedia
- bag of visual words
- spatial and temporal
- image sequences
- moving objects
- scene representation
- saliency map
- natural images
- spatial layout
- computer vision
- multi-instance multi-label learning
- human motion
- visual information
- human actions
- bag of words
- keypoints
- feature extraction