Perceive, Reason, and Align: Context-guided cross-modal correlation learning for image-text retrieval.
Zheng Liu, Xinlei Pei, Shanshan Gao, Changhao Li, Jingyao Wang, Junhao Xu
Published in: Appl. Soft Comput. (2024)
Keyphrases
- text retrieval
- image retrieval
- cross modal
- multimedia retrieval
- perceptual information
- image data
- image classification
- learning process
- visual recognition
- image content
- image features
- retrieval systems
- document retrieval
- image representation
- image regions
- retrieval model
- user queries
- image collections
- document collections
- low level