ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification.
Liwen Zhang, Jiqing Han, Ziqiang Shi
Published in: INTERSPEECH (2020)
Keyphrases
- scene classification
- temporal relations
- object recognition
- scene categories
- natural scenes
- image classification
- indoor outdoor
- biologically inspired
- visual words
- spatial relations
- temporal information
- image representation
- temporal reasoning
- bag of features
- visual attention
- semantic information
- natural images
- image features
- similarity measure
- machine learning
- video clips
- knn
- high level
- computer vision