Multi-level network based on transformer encoder for fine-grained image-text matching.
Lei Yang, Yong Feng, Mingliang Zhou, Xiancai Xiong, Yongheng Wang, Baohua Qiang
Published in: Multim. Syst. (2023)
Keyphrases
- fine grained
- coarse grained
- image matching
- template matching
- image features
- image content
- keypoints
- multiscale
- image data
- input image
- feature points
- access control
- matching process
- image classification
- image retrieval
- image segmentation
- tightly coupled
- image set
- decoding process
- false matches
- matching algorithm
- bit rate
- graphical models
- web search
- text mining
- information retrieval