AudioCLIP: Extending CLIP to Image, Text and Audio.
Andrey Guzhov, Federico Raue, Jörn Hees, Andreas Dengel
Published in: CoRR (2021)
Keyphrases
- image data
- input image
- image content
- multiscale
- image features
- image analysis
- text graphics
- image collections
- low level
- image retrieval
- image segmentation
- high resolution
- region of interest
- image representation
- template matching
- single image
- image matching
- web images
- visual data
- image pixels
- multimedia
- image classification
- edge detection
- feature points
- similarity measure
- scanned documents
- image regions
- segmentation algorithm
- text retrieval
- medical images
- video clips
- pixel values
- image database
- audio visual
- textual descriptions
- information retrieval