Abstract
Deep neural networks (DNNs) have proven capable of solving the ill-posed monocular depth estimation problem in self-driving applications. In this paper, we propose a dense residual multi-resolution supervised DNN for accurate monocular depth estimation in traffic landscape scenes. The proposed DNN is constructed by regularly integrating dense residual short-cut connections into a multi-resolution backbone. Because some implicitly influential features may not survive to the end of learning, the part of the DNN that generates the details of the estimated depth should not be made too deep. The structural depth of the DNN can be limited by effectively exploiting functional residual connections, and in the proposed structure the number of short-cut connections is kept moderate by placing them judiciously. In particular, to achieve high modularization, we adopt three-layer modules to build up the required levels and layers, so that the results can easily be tuned to meet a requested prediction/inference accuracy. Visual and quantitative results on street landscape experiments demonstrate the superiority of the proposed DNN over the compared DNNs.
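The abstract does not give the layer-level details of the three-layer modules, so the following is only a minimal sketch of how dense and residual short-cut connections could be combined in one such module. The class name `DenseResidualModule`, the parameters `width` and `growth`, the use of plain linear layers with ReLU on a 1-D feature vector, and the random weights are all assumptions made for illustration; the multi-resolution backbone is not modeled here.

```python
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def make_weights(n_out, n_in, rng, scale=0.1):
    """Small random weight matrix of shape (n_out, n_in)."""
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


class DenseResidualModule:
    """Hypothetical three-layer module (illustration only).

    Dense short-cuts: layer k receives the module input concatenated
    with the outputs of all earlier layers.
    Residual short-cut: the module input is added to the last layer's output.
    """

    def __init__(self, width, growth, n_layers=3, seed=0):
        rng = random.Random(seed)
        self.weights = []
        n_in = width
        for k in range(n_layers):
            is_last = (k == n_layers - 1)
            # The last layer maps back to `width` so the residual sum is valid.
            n_out = width if is_last else growth
            self.weights.append(make_weights(n_out, n_in, rng))
            n_in += n_out  # the next layer also sees this layer's output

    def forward(self, x):
        features = list(x)
        out = features
        for w in self.weights:
            out = relu(linear(w, features))
            features = features + out  # dense short-cut (concatenation)
        return [a + b for a, b in zip(x, out)]  # residual short-cut (sum)


module = DenseResidualModule(width=4, growth=2)
y = module.forward([1.0, -2.0, 0.5, 3.0])
print(len(y))  # same width as the input
```

In this sketch the input widths of the three layers grow as `width`, `width + growth`, and `width + 2 * growth`, which is what keeps early features directly reachable by later layers without stacking more depth. A real depth-estimation network would use convolutional layers on image tensors instead of the linear layers assumed here.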
Acknowledgments
This work was supported by the Ministry of Science and Technology of Taiwan under grant MOST 109-2221-E-415-016-
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chan, D.Y., Chang, CI., Wu, P.H., Chiang, C.C. (2022). Multi-resolution Dense Residual Networks with High-Modularization for Monocular Depth Estimation. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing & Optimization. ICO 2021. Lecture Notes in Networks and Systems, vol 371. Springer, Cham. https://doi.org/10.1007/978-3-030-93247-3_19
DOI: https://doi.org/10.1007/978-3-030-93247-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93246-6
Online ISBN: 978-3-030-93247-3
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)