Structured deep learning based object-specific distance estimation from a monocular image

Shi, Yu; Lin, Tao; Chen, Biao; Wang, Ruixia; Zhang, Yabo

doi:10.1007/s13042-023-01887-6

Original Article
Published: 11 June 2023

Volume 14, pages 4151–4161 (2023)

International Journal of Machine Learning and Cybernetics

Yu Shi, Tao Lin (ORCID: orcid.org/0000-0002-3440-3874), Biao Chen, Ruixia Wang & Yabo Zhang


Abstract

Distance estimation is a critical step in applications such as object trajectory prediction and obstacle avoidance in autonomous driving. However, distance estimation with deep learning methods has yet to attract wide attention. Traditional distance estimation algorithms based on optical principles and mathematical modeling have low accuracy in practical applications, mainly because of curves or slopes in the road surface. This paper addresses the challenging distance estimation problem by developing an end-to-end structured model that directly predicts the distance of objects in a given image, replacing the traditional mathematical modeling process with a learning-based method. To facilitate research on this task, we construct extended distance datasets from the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) and NYU Depth V2 (Nathan Silberman, Pushmeet Kohli, Derek Hoiem, Rob Fergus) datasets. Experimental results demonstrate that the structured learning model achieves higher accuracy than the traditional algorithm across different distance ranges and performs better on curves and ramps. Improving the performance of the neural network is the direction for future work on the model.
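The abstract's remark about curved or sloped roads can be illustrated with a generic textbook baseline. The sketch below is not the authors' method or their comparison algorithm. It assumes a pinhole camera with a horizontal optical axis mounted at a known height, and it estimates distance from the image row where an object touches the road. All function names and numeric values (focal length, camera height, horizon row) are hypothetical and chosen only for illustration.

```python
import math


def flat_road_distance(y_bottom_px, horizon_px, focal_px, cam_height_m):
    """Flat-road estimate: Z = f * H / (y_bottom - y_horizon).

    Assumes a pinhole camera with a horizontal optical axis and image
    rows that increase downward.
    """
    dy = y_bottom_px - horizon_px
    if dy <= 0:
        raise ValueError("contact point must lie below the horizon row")
    return focal_px * cam_height_m / dy


def contact_row_on_slope(distance_m, slope_rad, horizon_px, focal_px, cam_height_m):
    """Image row of a road point at the given horizontal distance when the
    road rises at a constant slope starting directly below the camera."""
    # Height of the road surface above the flat reference plane.
    rise = distance_m * math.tan(slope_rad)
    return horizon_px + focal_px * (cam_height_m - rise) / distance_m


# Hypothetical camera: 700 px focal length, 1.5 m mounting height, horizon at row 200.
F, H, Y0 = 700.0, 1.5, 200.0
true_distance = 30.0

row_flat = contact_row_on_slope(true_distance, 0.0, Y0, F, H)
row_uphill = contact_row_on_slope(true_distance, math.radians(1.0), Y0, F, H)

# On a flat road the formula recovers the true distance; on an uphill
# road the same formula overestimates it.
print(flat_road_distance(row_flat, Y0, F, H))
print(flat_road_distance(row_uphill, Y0, F, H))
```

Under these assumptions, even a small uphill grade moves the contact point toward the horizon row, so the flat-road formula returns a distance larger than the true one. A model that learns distance directly from image data does not rely on this flat-road assumption.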


Availability of data and materials

Not applicable.


Funding

Not applicable.

Author information

Yu Shi and Tao Lin contributed equally to this work.

Authors and Affiliations

School of Computer Science and Information Engineering, Shanghai Institute of Technology, HaiQuan Road, Shanghai, 201418, People’s Republic of China
Yu Shi, Tao Lin, Biao Chen, Ruixia Wang & Yabo Zhang

Corresponding author

Correspondence to Tao Lin.

Ethics declarations

Conflict of interest

There is no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Code availability

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shi, Y., Lin, T., Chen, B. et al. Structured deep learning based object-specific distance estimation from a monocular image. Int. J. Mach. Learn. & Cyber. 14, 4151–4161 (2023). https://doi.org/10.1007/s13042-023-01887-6

Received: 22 October 2022
Accepted: 25 May 2023
Published: 11 June 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s13042-023-01887-6
