
Self-supervised monocular depth estimation with occlusion mask and edge awareness

  • Original Article
  • Published:
Artificial Life and Robotics

Abstract

Depth estimation is a fundamental task in 3D vision. Recently, much work on self-supervised depth estimation has exploited geometric consistency between frames. However, these methods still struggle in ill-posed areas, such as occluded and texture-less regions. This work proposes a novel self-supervised monocular depth estimation method based on an occlusion mask and edge awareness to overcome these difficulties. The occlusion mask divides the image into two classes, occluded and non-occluded pixels, so that the network is trained only where the photometric supervision is valid. The edge-awareness loss function is built on edges obtained by a traditional edge detector, which makes the method robust to varying lighting conditions. We evaluated the proposed method on the KITTI dataset; both the occlusion mask and the edge awareness help to find corresponding points in ill-posed areas.
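The two ideas in the abstract can be illustrated with a minimal sketch. The paper's exact loss terms and mask construction are not reproduced here, so the formulation below — an exponential edge weighting on the depth-smoothness penalty, and a binary occlusion mask that compares reprojection error against the error of the unwarped source frame — is an assumption in the style of common self-supervised pipelines, not the authors' implementation.

```python
import numpy as np

def edge_aware_smoothness(depth, image):
    """Edge-aware smoothness term (sketch): penalize depth gradients,
    but down-weight the penalty where the image itself has strong
    gradients, so depth discontinuities are allowed at image edges."""
    dd_x = np.abs(np.diff(depth, axis=1))  # horizontal depth gradients
    dd_y = np.abs(np.diff(depth, axis=0))  # vertical depth gradients
    di_x = np.abs(np.diff(image, axis=1))  # horizontal image gradients
    di_y = np.abs(np.diff(image, axis=0))  # vertical image gradients
    # exp(-|∂I|) is close to 1 on flat regions and small at edges
    return (dd_x * np.exp(-di_x)).mean() + (dd_y * np.exp(-di_y)).mean()

def occlusion_mask(reproj_err, identity_err):
    """Binary occlusion mask (sketch): keep only pixels where warping
    the source frame explains the target better than the unwarped
    source does; the rest are treated as occluded/invalid and excluded
    from the photometric loss."""
    return (reproj_err < identity_err).astype(np.float32)
```

In a full pipeline, `reproj_err` would be a per-pixel photometric error between the target frame and a source frame warped through the predicted depth and pose; here both inputs are plain arrays to keep the sketch self-contained.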



Author information


Corresponding author

Correspondence to Shi Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Zhou, S., Zhu, M., Li, Z. et al. Self-supervised monocular depth estimation with occlusion mask and edge awareness. Artif Life Robotics 26, 354–359 (2021). https://doi.org/10.1007/s10015-021-00685-z
