Advanced Multimodal Deep Learning Architecture for Image-Text Matching.
Jinyin Wang, Haijing Zhang, Yihao Zhong, Yingbin Liang, Rongwei Ji, Yiru Cang
Published in: CoRR (2024)
Keyphrases
- deep learning
- keypoints
- image features
- input image
- image content
- single image
- multiscale
- image retrieval
- text mining
- image classification
- region of interest
- image segmentation
- similarity measure
- unsupervised learning
- image representation
- test images
- information retrieval
- natural images
- segmentation method
- text classification
- training data
- bounding box
- machine learning