Depth as attention to learn image representations for visual localization, using monocular images.
Dulmini Hettiarachchi, Ye Tian, Han Yu, Shunsuke Kamijo
Published in: J. Vis. Commun. Image Represent. (2024)
Keyphrases
- image representation
- monocular images
- depth estimation
- image classification
- pose estimation
- depth map
- multiscale
- human body
- image features
- stereo vision
- visual features
- bag of words
- object recognition
- image retrieval
- human pose
- depth information
- visual information
- real scenes
- feature space
- visual words
- image sequences
- depth images
- scene understanding
- stereo matching
- computer vision
- temporal sequences
- image segmentation
- dynamic scenes
- visual attention
- 3D scene
- machine learning
- feature extraction