Abstract
Depth estimation is a fundamental task in 3D vision. Recently, much work has been devoted to self-supervised depth estimation based on geometric consistency between frames. However, these methods still struggle in ill-posed regions, such as occluded and texture-less areas. This work proposes a novel self-supervised monocular depth estimation method based on an occlusion mask and edge awareness to overcome these difficulties. The occlusion mask divides the image into two classes, making the training of the network more reasonable. The edge-awareness loss function is built on edges obtained by a traditional edge detector, which makes the method robust to varying lighting conditions. We evaluate the proposed method on the KITTI dataset and find that both the occlusion mask and edge awareness help establish correct correspondences in ill-posed regions.
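The abstract does not reproduce the exact loss formulations. As a rough sketch of the two ideas it names, the snippet below shows (a) an edge-aware smoothness term that penalizes depth gradients less where the image has strong edges, a common formulation in this literature, and (b) a binary occlusion mask obtained by comparing reprojection error against the error of the unwarped frame, in the spirit of Monodepth2-style auto-masking. All function names, the weighting factor `alpha`, and the masking criterion are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def edge_aware_smoothness_loss(depth, image, alpha=1.0):
    """Penalize depth gradients, down-weighted at image edges.

    A common edge-aware smoothness formulation (assumed here;
    the paper's exact loss and edge detector are not shown).
    depth: (H, W) predicted depth map
    image: (H, W) grayscale image providing edge weights
    """
    # First-order finite differences of depth and image
    d_dx = np.abs(depth[:, 1:] - depth[:, :-1])
    d_dy = np.abs(depth[1:, :] - depth[:-1, :])
    i_dx = np.abs(image[:, 1:] - image[:, :-1])
    i_dy = np.abs(image[1:, :] - image[:-1, :])
    # Edge-aware weights: small where the image gradient is large,
    # so depth discontinuities at image edges are not penalized
    w_x = np.exp(-alpha * i_dx)
    w_y = np.exp(-alpha * i_dy)
    return float((d_dx * w_x).mean() + (d_dy * w_y).mean())

def occlusion_mask(reproj_error, identity_error):
    """Binary mask separating usable pixels from likely occlusions.

    Heuristic sketch: trust a pixel only when warping reduces the
    photometric error relative to the unwarped frame (the paper's
    actual two-class criterion may differ).
    """
    return (reproj_error < identity_error).astype(np.float32)
```

A perfectly flat depth map incurs zero smoothness loss regardless of the image, and the mask zeroes out pixels whose reprojection error is not improved by warping, so those pixels do not drive the photometric loss during training.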
Zhou, S., Zhu, M., Li, Z. et al. Self-supervised monocular depth estimation with occlusion mask and edge awareness. Artif Life Robotics 26, 354–359 (2021). https://doi.org/10.1007/s10015-021-00685-z