Can Audio Captions Be Evaluated With Image Caption Metrics?
Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu
Published in: ICASSP (2022)
Keyphrases
- image data
- multiscale
- image features
- input image
- visual features
- image retrieval
- image analysis
- visual data
- image classification
- image representation
- image segmentation
- test images
- single image
- image collections
- bounding box
- image content
- pixel values
- multimedia
- caption text
- fusion method
- low level
- feature points
- image regions
- segmentation algorithm
- region of interest
- connected components
- visual information
- image pixels
- high resolution
- segmentation method
- news video