Performance Evaluation of Depth Completion Neural Networks for Various RGB-D Camera Technologies in Indoor Scenarios

  • Conference paper
  • AIxIA 2023 – Advances in Artificial Intelligence (AIxIA 2023)

Abstract

RGB-D cameras have become essential in robotics for accurate perception and object recognition, enabling robots to navigate environments, avoid obstacles, and manipulate objects precisely. Besides RGB information, such cameras capture an additional image that encodes the distance of each point in the scene from the camera. Popular depth acquisition techniques include active stereoscopy, which triangulates corresponding points between two camera views (often aided by a projected infrared pattern), and Time-of-Flight (ToF), which measures the travel time of emitted infrared light. Despite their different operating principles, none of these technologies can yet provide accurate depth information over the entire image, owing to factors such as sunlight, reflective surfaces, or large distances from the camera; the result is noisy or incomplete depth images. Neural-network-based depth completion methods address this problem by producing dense depth maps from an RGB image and sparse depth measurements. This paper compares the data provided by different depth-sensing technologies, highlighting their pros and cons in two main benchmark setups. After an analysis of the sensors’ accuracy under different conditions, several state-of-the-art neural networks are evaluated in an indoor scenario to assess whether they can improve the quality of the raw depth images provided by each sensor.
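As a minimal point of reference for the two ideas the abstract relies on, the sketch below (not from the paper; the function names, the zero-encoding of invalid pixels, and the random stand-in data are assumptions) shows the stereo triangulation relation Z = f·B/d used by active stereo sensors, and the RMSE/MAE metrics commonly used in the depth-completion literature, computed only over pixels where ground-truth depth is available:

    import numpy as np

    def stereo_depth_from_disparity(disparity, focal_px, baseline_m):
        """Depth from active-stereo triangulation: Z = f * B / d.
        Pixels with non-positive disparity are left at 0 (invalid)."""
        depth = np.zeros_like(disparity, dtype=np.float64)
        valid = disparity > 0
        depth[valid] = focal_px * baseline_m / disparity[valid]
        return depth

    def depth_completion_errors(pred, gt):
        """RMSE and MAE over pixels with valid ground truth -- the usual
        metrics for scoring a completed depth map against a reference.
        Missing ground-truth depth is assumed to be encoded as 0."""
        valid = gt > 0
        diff = pred[valid] - gt[valid]
        rmse = float(np.sqrt(np.mean(diff ** 2)))
        mae = float(np.mean(np.abs(diff)))
        return rmse, mae

    # Hypothetical usage on random data standing in for sensor output:
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        disparity = rng.uniform(0.0, 64.0, size=(480, 640))
        raw = stereo_depth_from_disparity(disparity, focal_px=600.0,
                                          baseline_m=0.05)
        gt = raw + rng.normal(0.0, 0.01, size=raw.shape)  # stand-in reference
        print(depth_completion_errors(raw, gt))

Scoring only over valid ground-truth pixels matters here: depth sensors leave holes, and including those zeros in the error would penalise a completion network for regions where no reference measurement exists.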



Author information

Correspondence to Matteo Terreran.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Castellano, R., Terreran, M., Ghidoni, S. (2023). Performance Evaluation of Depth Completion Neural Networks for Various RGB-D Camera Technologies in Indoor Scenarios. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_24


  • DOI: https://doi.org/10.1007/978-3-031-47546-7_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47545-0

  • Online ISBN: 978-3-031-47546-7

  • eBook Packages: Computer Science, Computer Science (R0)
