Knowledge Graphs Meet Geometry for Semi-supervised Monocular Depth Estimation

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2020)

Abstract

Depth estimation from a single image plays an important role in computer vision, and using semantic information to improve it has become a research hotspot. Traditional neural network-based semantic methods only partition the image according to its features and cannot draw on deeper background knowledge about the real world. In recent years, knowledge graphs have been proposed and used to model semantic knowledge. In this paper, we enhance the traditional depth prediction method by analyzing the semantic information of the image through a knowledge graph. Background knowledge from the knowledge graph is used to enhance the results of semantic segmentation, which in turn improves the depth estimation results. We conducted experiments on the KITTI driving dataset, and the results show that our method outperforms previous unsupervised and supervised learning methods. Results on the Apollo dataset demonstrate that our method also works in the common case.

Author information

Corresponding author

Correspondence to Fusheng Jin.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, Y., Jin, F., Wang, M., Wang, S. (2020). Knowledge Graphs Meet Geometry for Semi-supervised Monocular Depth Estimation. In: Li, G., Shen, H., Yuan, Y., Wang, X., Liu, H., Zhao, X. (eds) Knowledge Science, Engineering and Management. KSEM 2020. Lecture Notes in Computer Science, vol 12274. Springer, Cham. https://doi.org/10.1007/978-3-030-55130-8_4

  • DOI: https://doi.org/10.1007/978-3-030-55130-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55129-2

  • Online ISBN: 978-3-030-55130-8

  • eBook Packages: Computer Science, Computer Science (R0)