CMA-CLIP: Cross-Modality Attention CLIP for Image-Text Classification.

Huidong Liu, Shaoyuan Xu, Jinmiao Fu, Yang Liu, Ning Xie, Chien-Chih Wang, Bryan Wang, Yi Sun

Published in: CoRR (2021)

Keyphrases

text classification
image data
image features
image content
image analysis
image classification
multiscale
single image
input image
video clips
image representation
high resolution
bag of words
image regions
edge detection
image matching
image retrieval
image segmentation
image processing
pixel values
template matching
feature selection
neural network
multi-label
spatial information
region of interest
segmentation method
low-level features
object recognition
step size