HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification.
Shuyi Ouyang, Hongyi Wang, Ziwei Niu, Zhenjia Bai, Shiao Xie, Yingying Xu, Ruofeng Tong, Yen-Wei Chen, Lanfen Lin
Published in: CoRR (2024)
Keyphrases
- multi label
- image classification
- hierarchical text categorization
- multi label classification
- binary classification
- visual features
- bag of words
- image annotation
- natural language
- classifier training
- multi label learning
- image features
- feature extraction
- multi instance
- text categorization
- computer vision
- image representation
- visual words
- svm classifier
- max margin
- protein function prediction
- class labels
- multiple labels
- multi class
- naive bayes