Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization.
Yeji Song, Jimyeong Kim, Wonhark Park, Wonsik Shin, Wonjong Rhee, Nojun Kwak
Published in: CoRR (2024)
Keyphrases
- web images
- image data
- low level
- textual information
- image analysis
- visual perception
- input image
- image content
- multiscale
- textual descriptions
- image features
- image classification
- image representation
- single image
- image segmentation
- image retrieval
- visual features
- visual representations
- high resolution
- textual and visual information
- visual appearance
- keywords
- keypoints
- image collections
- visually similar
- visual and textual features
- visual information
- segmentation algorithm
- spatial relations
- test images
- visual data
- auto annotation
- textual data
- information retrieval
- edge detection
- text information
- user interface
- metadata
- image sequences
- visual attributes
- high dimensional
- visual similarity
- image quality
- image regions
- visual content