A free lunch from ViT: adaptive attention multi-scale fusion Transformer for fine-grained visual recognition.
Yuan Zhang, Jian Cao, Ling Zhang, Xiangcheng Liu, Zhiyi Wang, Feng Ling, Weiqian Chen
Published in: ICASSP (2022)
Keyphrases
- fine grained
- visual recognition
- multiscale
- coarse grained
- image classification
- visual recognition tasks
- object recognition
- access control
- image fusion
- tightly coupled
- visual categorization
- latent topic models
- image segmentation
- natural images
- image representation
- scale space
- visual attention
- machine learning
- object detection
- image retrieval
- similarity measure
- data lineage
- feature extraction
- feature selection