Boosting LightWeight Depth Estimation via Knowledge Distillation

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2023)

Abstract

Monocular depth estimation (MDE) methods are often either too computationally expensive or insufficiently accurate because of the trade-off between model complexity and inference performance. In this paper, we propose a lightweight network that accurately estimates depth maps using minimal computing resources. We achieve this by designing a compact model that reduces model complexity as far as possible. To improve the performance of this lightweight network, we adopt knowledge distillation (KD): a large network serves as an expert teacher that accurately estimates depth maps on the target domain, and the student, the lightweight network, is trained to mimic the teacher’s predictions. However, this KD process can be challenging and insufficient on its own because of the large capacity gap between the teacher and the student. To address this, we propose to use auxiliary unlabeled data to guide KD, enabling the student to learn more effectively from the teacher’s predictions. This narrows the gap between teacher and student and improves data-driven learning. Experiments show that our method achieves performance comparable to state-of-the-art methods while using only 1% of their parameters, and it outperforms previous lightweight methods in inference accuracy, computational efficiency, and generalizability.
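The training objective described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual implementation: it assumes a simple L1 loss for both the supervised and the distillation terms, and the names `distillation_loss`, `lambda_kd`, `teacher_pred`, and `gt_depth` are illustrative choices, not taken from the paper.

```python
import numpy as np

def distillation_loss(student_pred, teacher_pred, gt_depth=None, lambda_kd=1.0):
    """Combined objective for labeled and auxiliary unlabeled batches.

    On labeled data the student matches both the ground-truth depth and
    the teacher's predictions; on auxiliary unlabeled data no ground
    truth exists, so the teacher-mimicking (KD) term alone drives
    learning. All arrays share the same shape, e.g. (batch, H, W).
    """
    # KD term: mean absolute difference to the teacher's depth map.
    kd_term = np.abs(student_pred - teacher_pred).mean()
    if gt_depth is None:
        # Auxiliary unlabeled batch: only the teacher supervises.
        return lambda_kd * kd_term
    # Labeled batch: supervised L1 term plus weighted KD term.
    sup_term = np.abs(student_pred - gt_depth).mean()
    return sup_term + lambda_kd * kd_term
```

In a training loop, one would alternate or mix labeled batches (calling the function with `gt_depth`) and auxiliary unlabeled batches (calling it with `gt_depth=None`), so the extra unlabeled images let the student see more of the teacher's behavior across the target domain.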



Acknowledgements

This work is supported by the Shenzhen Science and Technology Program (JSGG20220606142803007) and the funding AC01202101103 from the Shenzhen Institute of Artificial Intelligence and Robotics for Society.

Author information

Corresponding author

Correspondence to Tin Lun Lam.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, J. et al. (2023). Boosting LightWeight Depth Estimation via Knowledge Distillation. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, AM., Ma, W. (eds) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science(), vol 14117. Springer, Cham. https://doi.org/10.1007/978-3-031-40283-8_3

  • DOI: https://doi.org/10.1007/978-3-031-40283-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40282-1

  • Online ISBN: 978-3-031-40283-8

  • eBook Packages: Computer Science (R0)
