ObjCAViT: Improving Monocular Depth Estimation Using Natural Language Models And Image-Object Cross-Attention.
Dylan Auty, Krystian Mikolajczyk
Published in: CoRR (2022)
Keyphrases
- depth estimation
- language model
- depth cues
- stereo vision
- image features
- image data
- depth map
- single image
- input image
- image retrieval
- complex scenes
- depth estimates
- scene understanding
- probabilistic model
- stereo matching
- dynamic scenes
- depth information
- position and orientation
- information retrieval
- image segmentation
- partial occlusion
- image matching
- lighting conditions
- pose estimation
- keypoints
- super resolution
- vision system
- moving objects
- feature selection
- matching algorithm
- feature points
- 3D objects
- active learning
- high resolution