Can Audio Captions Be Evaluated with Image Caption Metrics?
Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu
Published in: CoRR (2021)
Keyphrases
- relevance feedback
- image retrieval
- visual features
- image search
- image content
- image data
- image features
- input image
- image collections
- image analysis
- image representation
- image pixels
- single image
- image classification
- region of interest
- multimedia
- visual data
- signal processing
- image regions
- test images
- high resolution
- image segmentation
- edge detection
- segmentation method
- low level
- image set
- multiscale
- similarity measure
- pixel values
- bounding box
- image processing
- textual descriptions
- caption text