Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model.
Kanzhi Cheng, Wenpo Song, Zheng Ma, Wenhao Zhu, Zixuan Zhu, Jianbing Zhang
Published in: CoRR (2023)
Keyphrases
- prior knowledge
- bayesian framework
- high level
- multiscale
- statistical model
- image features
- real world
- generic model
- conceptual model
- color vision
- reconstruction method
- single image
- computer vision
- image processing
- similarity measure
- image data
- data sets
- prior model
- specification language
- prior information
- low level
- image representation
- computational model
- image regions
- high resolution
- training examples
- image segmentation
- image classification
- programming language
- probabilistic model
- probability distribution