Fine-tuning with Multi-modal Entity Prompts for News Image Captioning.
Jingjing Zhang, Shancheng Fang, Zhendong Mao, Zhiwei Zhang, Yongdong Zhang
Published in: ACM Multimedia (2022)
Keyphrases
- multi modal
- fine tuning
- image features
- image segmentation
- auto annotation
- input image
- uni modal
- image representation
- image analysis
- multi modality
- image data
- image classification
- multiscale
- image content
- fusing multiple
- high dimensional
- single modality
- segmentation method
- image regions
- image annotation
- low level
- cross modal
- edge detection
- image retrieval
- audio visual
- automatic image annotation
- semantic concepts
- multiple modalities
- web images
- multimedia
- high level
- particle filter
- similarity measure