Abstract
This paper introduces NGDE, a novel neuro-geometric methodology for estimating the depth of an object from a single image. NGDE is a hybrid methodology: it combines a geometric component with a deep learning component. It first leverages the geometric camera model to estimate a set of probable depth values between the camera and the object. These probable depth values, together with the pixel coordinates that define the boundaries of the object, are then passed to a deep learning component trained to output the final object depth estimate. Unlike previous approaches, NGDE does not require any prior information about the scene, such as the horizon line or reference objects. Instead, it projects a virtual 3D point cloud onto the 2D image plane and derives the probable depth values from the resulting 3D-2D point correspondences. A multilayer perceptron (MLP) that considers both the probable depth values and the 2D bounding box of the object then produces an accurate estimate of the object's depth. A major advantage of NGDE over state-of-the-art deep learning-based methods is that it uses a simple MLP instead of computationally complex Convolutional Neural Networks (CNNs). The evaluation of NGDE on KITTI indicates that it performs favorably against relevant state-of-the-art approaches.
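The two-stage idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not specify the details it needs, so everything below is an assumption made for illustration: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the choice of a flat grid at a fixed camera height as the virtual point cloud, the rule that selects virtual points projecting near the bottom edge of the bounding box, the summary statistics used as features, and the layer sizes of the untrained MLP. All function names are hypothetical.

```python
import numpy as np

def project_points(points_3d, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points (X right, Y down, Z forward)."""
    X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    return np.stack([fx * X / Z + cx, fy * Y / Z + cy], axis=1)

def virtual_ground_grid(cam_height, x_range, z_range, nx, nz):
    """Assumed virtual point cloud: a regular grid on a flat plane below the camera."""
    xs = np.linspace(x_range[0], x_range[1], nx)
    zs = np.linspace(z_range[0], z_range[1], nz)
    gx, gz = np.meshgrid(xs, zs)
    gy = np.full_like(gx, cam_height)  # Y points down, so the plane sits at +cam_height
    return np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

def probable_depths(bbox, points_3d, pixels, tol=2.0):
    """Depths (Z) of virtual points whose projection lies near the bbox bottom edge."""
    x_min, _, x_max, y_max = bbox
    u, v = pixels[:, 0], pixels[:, 1]
    mask = (u >= x_min) & (u <= x_max) & (np.abs(v - y_max) <= tol)
    return points_3d[mask, 2]

def build_features(bbox, depths):
    """Concatenate the 2D bounding box with min/mean/max of the probable depths."""
    if depths.size == 0:
        stats = np.zeros(3)  # no 3D-2D correspondence found for this box
    else:
        stats = np.array([depths.min(), depths.mean(), depths.max()])
    return np.concatenate([np.asarray(bbox, dtype=float), stats])

def mlp_forward(x, layers):
    """Forward pass of a plain MLP: ReLU on hidden layers, linear output."""
    h = x
    for i, (W, b) in enumerate(layers):
        h = h @ W + b
        if i < len(layers) - 1:
            h = np.maximum(h, 0.0)
    return h

if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    pts = virtual_ground_grid(1.65, (-10.0, 10.0), (5.0, 60.0), 41, 111)
    px = project_points(pts, fx=700.0, fy=700.0, cx=600.0, cy=180.0)
    bbox = (550.0, 150.0, 650.0, 237.75)  # (x_min, y_min, x_max, y_max) in pixels
    feats = build_features(bbox, probable_depths(bbox, pts, px))

    # Untrained random weights: the output only demonstrates shapes and data flow.
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(7, 16)), np.zeros(16)),
              (rng.normal(size=(16, 1)), np.zeros(1))]
    print(feats, mlp_forward(feats, layers))
```

In this sketch the geometric stage narrows the search to a handful of candidate depths, and the MLP input stays a short fixed-length vector, which is consistent with the abstract's claim that a simple MLP can replace a CNN. In practice the weights would have to be learned from labeled data such as KITTI.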
Acknowledgements
We acknowledge support of this work by the project “Smart Tourist” (MIS 5047243), which is implemented under the Action “Reinforcement of the Research and Innovation Infrastructure”, funded by the Operational Programme “Competitiveness, Entrepreneurship and Innovation” (NSRF 2014–2020) and co-financed by Greece and the European Union (European Regional Development Fund).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dimas, G., Gatoula, P., Iakovidis, D.K. (2023). A Single Image Neuro-Geometric Depth Estimation. In: Blanc-Talon, J., Delmas, P., Philips, W., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2023. Lecture Notes in Computer Science, vol 14124. Springer, Cham. https://doi.org/10.1007/978-3-031-45382-3_14
DOI: https://doi.org/10.1007/978-3-031-45382-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45381-6
Online ISBN: 978-3-031-45382-3
eBook Packages: Computer Science, Computer Science (R0)