Enhancing image captioning with depth information using a Transformer-based framework.
Aya Mahmoud Ahmed, Mohamed Yousef, Khaled F. Hussain, Youssef Bassyouni Mahdy
Published in: CoRR (2023)
Keyphrases
- depth information
- image based rendering
- image data
- depth map
- input image
- stereo vision
- shape from focus
- depth images
- high resolution
- multiscale
- depth cues
- image features
- disparity map
- depth data
- image retrieval
- depth recovery
- single image
- motion parallax
- view interpolation
- similarity measure
- image matching
- data sets
- multi view images