VIXEN: Visual Text Comparison Network for Image Difference Captioning.
Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John P. Collomosse
Published in: CoRR (2024)
Keyphrases
- web images
- multiscale
- image data
- input image
- single image
- image features
- visual features
- image retrieval
- image analysis
- image content
- low level
- image representation
- visual perception
- image classification
- visually similar
- image regions
- visual appearance
- visual data
- region of interest
- image collections
- visual cues
- text information
- human visual
- auto annotation
- image segmentation
- human observers
- visual effects
- keywords
- segmentation method
- image processing
- image search
- segmentation algorithm
- edge detection
- complex background
- image database
- spatial relations
- textual descriptions
- news video
- wireless sensor networks
- visual content
- high level
- scanned documents
- adjacent pixels
- web image search
- low level features