Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking.
Ivana Benová, Jana Kosecká, Michal Gregor, Martin Tamajka, Marcel Veselý, Marián Simko
Published in: CoRR (2024)
Keyphrases
- template matching
- image matching
- input image
- feature points
- matching process
- image data
- keypoints
- image analysis
- image content
- image features
- image classification
- multiscale
- single image
- feature matching
- image segmentation
- image set
- high resolution
- scene matching
- false matches
- image regions
- edge detection
- low level
- image collections
- pattern matching
- normalized correlation
- image representation
- region of interest
- image retrieval
- matching algorithm
- image pixels
- web images