Mix-ViT: Mixing attentive vision transformer for ultra-fine-grained visual categorization.
Xiaohan Yu, Jun Wang, Yang Zhao, Yongsheng Gao
Published in: Pattern Recognit. (2023)
Keyphrases
- fine-grained
- visual categorization
- visual recognition
- pre-attentive
- image classification
- coarse-grained
- object categories
- computer vision
- access control
- training examples
- intra-class variations
- massively parallel
- visual words
- active learning
- visual attention
- natural language processing
- small number
- image features
- object recognition
- image processing
- object detection