A free lunch from ViT: Adaptive Attention Multi-scale Fusion Transformer for Fine-grained Visual Recognition.
Yuan Zhang, Jian Cao, Ling Zhang, Xiangcheng Liu, Zhiyi Wang, Feng Ling, Weiqian Chen
Published in: CoRR (2021)
Keyphrases
- fine grained
- visual recognition
- multiscale
- coarse grained
- image classification
- object recognition
- visual recognition tasks
- access control
- image fusion
- latent topic models
- view independent
- scale space
- image representation
- tightly coupled
- visual categorization
- text classification
- natural images
- energy function
- image processing