Geometric Constraints for Self-supervised Monocular Depth Estimation on Laparoscopic Images with Dual-task Consistency

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Abstract

Depth values are essential for automating surgical robots and enabling augmented reality in minimally invasive surgery. Although depth-pose self-supervised monocular depth estimation performs impressively in autonomous driving scenarios, predicting accurate depth values for laparoscopic images is more challenging for two reasons: (i) the laparoscope’s motion contains many rotations, which makes pose estimation difficult for the depth-pose learning strategy; (ii) smooth surfaces yield a low photometric error even when the pixels matched between adjacent frames are inaccurate. This paper proposes a novel self-supervised monocular depth estimation method for laparoscopic images with geometric constraints. We predict scene coordinates as an auxiliary task and construct a dual-task consistency between the predicted depth maps and scene coordinates under a unified camera coordinate system to achieve pixel-level geometric constraints. We extend pose estimation into a Siamese process that leverages the order of adjacent frames in a video sequence, providing stronger and more balanced geometric constraints in the depth-pose learning strategy. We also design a weight mask for depth estimation, based on this consistency, to alleviate interference from low-confidence predictions. Experimental results show that the proposed method outperforms the baseline on both depth and pose estimation. Our code is available at https://github.com/MoriLabNU/GCDepthL.
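
The dual-task consistency described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the authors' repository. It assumes a pinhole camera with a known intrinsic matrix K and scene coordinates already expressed in the same camera frame as the depth map. The function names, the Euclidean error, and the exponential form of the weight mask are all hypothetical choices made only for illustration.

```python
import numpy as np


def backproject_depth(depth, K):
    """Lift an (H, W) depth map to (H, W, 3) camera-frame points.

    Assumes a pinhole model: X = depth * K^{-1} [u, v, 1]^T.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # column and row indices
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pixels @ np.linalg.inv(K).T  # applies K^{-1} to every pixel
    return rays * depth[..., None]


def consistency_error(depth, scene_coords, K):
    """Per-pixel Euclidean distance between the two 3D predictions."""
    return np.linalg.norm(backproject_depth(depth, K) - scene_coords, axis=-1)


def weight_mask(error, sigma=1.0):
    """Hypothetical confidence mask: 1 where both tasks agree, decaying to 0.

    The exponential form and sigma are assumptions, not the paper's definition.
    """
    return np.exp(-error / sigma)


# Toy example with made-up intrinsics and a flat surface at depth 2.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
depth = np.full((48, 64), 2.0)
scene_coords = backproject_depth(depth, K)  # perfectly consistent prediction
scene_coords[10, 10, 2] += 1.0              # inject one inconsistent pixel
mask = weight_mask(consistency_error(depth, scene_coords, K))
```

In this toy case the mask is close to 1 everywhere except at the perturbed pixel, which is down-weighted. A per-pixel weight of this kind could then scale a depth loss so that low-confidence predictions contribute less.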

Acknowledgments

The authors are grateful for the support from JST CREST Grant Number JPMJCR20D5; MEXT/JSPS KAKENHI Grant Numbers 17H00867, 26108006, and 21K19898; JSPS Bilateral International Collaboration Grants; and CIBoG program of Nagoya University from the MEXT WISE program.

Author information

Corresponding author

Correspondence to Kensaku Mori.

Electronic supplementary material

Supplementary material 1 (PDF, 180 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, W., Hayashi, Y., Oda, M., Kitasaka, T., Misawa, K., Mori, K. (2022). Geometric Constraints for Self-supervised Monocular Depth Estimation on Laparoscopic Images with Dual-task Consistency. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13434. Springer, Cham. https://doi.org/10.1007/978-3-031-16440-8_45

  • DOI: https://doi.org/10.1007/978-3-031-16440-8_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16439-2

  • Online ISBN: 978-3-031-16440-8

  • eBook Packages: Computer Science, Computer Science (R0)