MSViT: Training Multiscale Vision Transformers for Image Retrieval.
Xue Li, Jiong Yu, Shaochen Jiang, Hongchun Lu, Ziyang Li
Published in: IEEE Trans. Multim. (2024)
Keyphrases
- multiscale
- image retrieval
- image representation
- image processing
- image database
- vision system
- image segmentation
- real time
- computer vision
- wavelet transform
- scale space
- image content
- supervised learning
- image annotation
- edge detection
- training phase
- nearest neighbor search
- visual features
- image features
- data sets
- relevance feedback
- training samples
- text retrieval
- hidden Markov models
- coarse to fine
- wavelet domain
- shape representation
- learning algorithm