ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT.
Hwanhee Lee, Seunghyun Yoon, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Kyomin Jung
Published in: Eval4NLP (2020)
Keyphrases
- image data
- input image
- single image
- image representation
- image analysis
- multiscale
- template matching
- image features
- image content
- image segmentation
- high resolution
- image regions
- image classification
- programming language
- image collections
- natural language
- image processing
- test images
- visual perception
- image synthesis
- caption text
- vision system
- image retrieval
- computer vision
- image matching
- region of interest
- edge map
- similarity measure
- real time
- keypoints
- segmentation algorithm
- image quality
- language learning
- image pixels
- bounding box