Shallow Convolution-Augmented Transformer with Differentiable Neural Computer for Low-Complexity Classification of Variable-Length Acoustic Scene.
Soonshin Seo, Donghyun Lee, Ji-Hwan Kim
Published in: Interspeech (2021)
Keyphrases
- low complexity
- variable length
- fixed length
- feature vectors
- computational complexity
- motion estimation
- bit plane
- feature space
- n gram
- text classification
- distributed video coding
- bitstream
- video sequences
- feature extraction
- image processing
- information extraction
- coding scheme
- image sequences
- three dimensional
- machine learning