CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification.
Chun-Fu Chen, Quanfu Fan, Rameswar Panda
Published in: CoRR (2021)
Keyphrases
- image classification
- multiscale
- image representation
- bag of words
- computer vision
- real time
- vision system
- feature extraction
- visual features
- image processing
- visual words
- image features
- sparse coding
- scale space
- multi label
- visual perception
- object recognition
- bag of features
- class specific
- visual attention
- sparse representation
- edge detection
- fuzzy logic
- wavelet transform
- high voltage
- visual field
- power transformers
- focus of attention
- computational vision
- local binary pattern
- coarse to fine
- fault diagnosis
- image registration
- feature selection
- information retrieval