VIXEN: Visual Text Comparison Network for Image Difference Captioning.
Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John P. Collomosse
Published in: AAAI (2024)
Keyphrases
- web images
- image data
- multiscale
- image features
- input image
- image content
- single image
- visual perception
- visual appearance
- low level
- visual cues
- image representation
- image analysis
- image classification
- image segmentation
- human visual
- pixel values
- visually similar
- image regions
- keypoints
- segmentation method
- text mining
- image retrieval
- information retrieval
- visual effects
- image collections
- region of interest
- visual information
- image processing
- spatial information
- news video
- text information
- semantic information
- visual features
- visual attributes
- scanned documents