Self-supervised multi-frame depth estimation with visual-inertial pose transformer and monocular guidance.
Xiang Wang, Haonan Luo, Zihang Wang, Jin Zheng, Xiao Bai
Published in: Inf. Fusion (2024)
Keyphrases
- monocular images
- depth estimation
- multi frame
- super resolution
- image sequences
- depth map
- model based pose estimation
- stereo vision
- pose estimation
- motion estimation
- point correspondences
- low resolution
- stereo matching
- high resolution
- optical flow
- depth information
- dynamic scenes
- scene understanding
- motion analysis
- feature points
- stereo pair
- feature matching
- optic flow
- human body
- human pose
- real scenes
- high quality
- 3D objects
- stereo camera
- single image
- depth images
- object tracking
- multiscale
- disparity map
- three dimensional
- position and orientation
- stereo images
- closed form
- computer vision
- 3D scene
- object detection
- vision system
- computationally efficient
- multi view