Unsupervised Single-View Depth Estimation for Real Time Inference

Siddiqui, Mohammed Arshad; Jain, Arpit; Gour, Neha; Khanna, Pritee

doi:10.1007/978-981-15-4015-8_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1147)

Included in the following conference series:

International Conference on Computer Vision and Image Processing

Abstract

Several unsupervised approaches have recently been proposed to improve the accuracy of depth prediction. However, none of them is flexible enough to be deployed in real-time environments with limited computational capabilities. Inference latency is a major factor limiting the application of such methods in real-world scenarios where high-end GPUs cannot be deployed. This work proposes six models, based on three approaches, that reduce the inference latency of depth prediction without losing accuracy. The proposed solutions can be deployed in real-world applications with limited computational power and memory. The new models are also compared with models recently proposed in the literature to establish a state-of-the-art depth prediction model that can be used in real time.

Author information

Authors and Affiliations

PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, India
Mohammed Arshad Siddiqui, Arpit Jain, Neha Gour & Pritee Khanna

Corresponding author

Correspondence to Mohammed Arshad Siddiqui.

Editor information

Editors and Affiliations

Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Neeta Nain
Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Santosh Kumar Vipparthi
Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Balasubramanian Raman

About this paper

Cite this paper

Siddiqui, M.A., Jain, A., Gour, N., Khanna, P. (2020). Unsupervised Single-View Depth Estimation for Real Time Inference. In: Nain, N., Vipparthi, S., Raman, B. (eds) Computer Vision and Image Processing. CVIP 2019. Communications in Computer and Information Science, vol 1147. Springer, Singapore. https://doi.org/10.1007/978-981-15-4015-8_9

DOI: https://doi.org/10.1007/978-981-15-4015-8_9
Published: 29 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4014-1
Online ISBN: 978-981-15-4015-8
eBook Packages: Computer Science, Computer Science (R0)
