GloVe-ing Attention: A Multi-Modal Neural Learning Approach to Image Captioning.
Lars Halvor Anundskås, Hina Afridi, Adane Nega Tarekegn, Muhammad Mudassar Yamin, Mohib Ullah, Saira Yamin, Faouzi Alaya Cheikh
Published in: ICASSP Workshops (2023)
Keyphrases
- multi modal
- neural learning
- image data
- multi modality
- image analysis
- input image
- image content
- image features
- image classification
- uni modal
- audio visual
- auto annotation
- fusing multiple
- low level
- image retrieval
- edge detection
- image annotation
- high dimensional
- multiscale
- image segmentation
- cross modal
- image representation
- image collections
- multi layered
- supervised learning
- segmentation method
- training data
- feature extraction
- neural network