Bootstrapped Self-Supervised Training with Monocular Video for Semantic Segmentation and Depth Estimation.
Yihao Zhang, John J. Leonard
Published in: CoRR (2021)
Keyphrases
- street scenes
- depth estimation
- semantic segmentation
- dynamic scenes
- stereo vision
- depth map
- stereo matching
- depth information
- image sequences
- conditional random fields
- superpixels
- scene understanding
- video sequences
- feature matching
- space time
- multi view
- super resolution
- video frames
- stereo pair
- pose estimation
- real scenes
- training set
- three dimensional
- video surveillance
- object classes
- viewpoint
- image features
- facial expressions
- 3D scene
- high quality
- higher order