Advanced Multimodal Deep Learning Architecture for Image-Text Matching.

Jinyin Wang, Haijing Zhang, Yihao Zhong, Yingbin Liang, Rongwei Ji, Yiru Cang

Published in: CoRR (2024)

Keyphrases

deep learning
keypoints
image features
input image
image content
single image
multiscale
image retrieval
text mining
image classification
region of interest
image segmentation
similarity measure
unsupervised learning
image representation
test images
information retrieval
natural images
segmentation method
text classification
training data
bounding box
machine learning