
Cross-based dense depth estimation by fusing stereo vision with measured sparse depth

  • Original article
  • Published in: The Visual Computer

Abstract

Dense depth estimation is important in robotic systems for tasks such as mapping, localization, and object recognition. Among the available sensors, an active depth sensor provides accurate but sparse measurements of the environment, while a stereo camera pair provides dense but imprecise reconstruction results. In this paper, a tightly coupled fusion method for a depth sensor and a stereo camera is proposed to perform dense depth estimation, combining the advantages of the two types of sensors to achieve better depth estimates. An adaptive dynamic cross-arm algorithm is developed to integrate the sparse depth measurements into camera-dominated semiglobal stereo matching. To obtain the optimal arm length for each measured pixel, the shape of every cross arm is variable and is computed automatically. Comparison experiments on the public KITTI, Middlebury, and Scene Flow datasets are used to evaluate the performance of the proposed method, and real-world experiments are further conducted for verification.
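The exact fusion formulation appears in the full paper and the linked repository; as a rough illustration of the cross-arm idea only, the sketch below computes adaptive arm lengths for a pixel from color similarity (in the spirit of cross-based cost aggregation) and uses them to bias a stereo matching cost volume toward sparse depth seeds. All function names and parameter values (`tau`, `max_arm`, `weight`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cross_arms(img, y, x, tau=20.0, max_arm=17):
    """Adaptive cross-arm lengths (left, right, up, down) for pixel (y, x):
    each arm extends while the color difference to the anchor pixel stays
    below `tau`, up to `max_arm` pixels. Parameters are illustrative."""
    h, w = img.shape[:2]
    anchor = img[y, x].astype(np.float64)

    def extend(dy, dx):
        length = 0
        for step in range(1, max_arm + 1):
            ny, nx = y + dy * step, x + dx * step
            if not (0 <= ny < h and 0 <= nx < w):
                break  # arm reached the image border
            if np.abs(img[ny, nx].astype(np.float64) - anchor).max() > tau:
                break  # color changed too much: likely a depth boundary
            length = step
        return length

    return extend(0, -1), extend(0, 1), extend(-1, 0), extend(1, 0)

def bias_cost_volume(cost, seeds, img, weight=0.5):
    """Hypothetical fusion step: for each sparse seed (y, x, d), make the
    matching cost at disparity d cheaper for every pixel on the seed's
    cross arms, so the measurement guides the subsequent aggregation.
    `cost` is an H x W x D volume; `weight` < 1 strengthens the seed."""
    for y, x, d in seeds:
        left, right, up, down = cross_arms(img, y, x)
        pixels = [(y, x + i) for i in range(-left, right + 1)]
        pixels += [(y + j, x) for j in range(-up, down + 1) if j != 0]
        for ny, nx in pixels:
            cost[ny, nx, d] *= weight
    return cost
```

In a semiglobal pipeline, the biased volume would then pass through the usual multi-directional path aggregation and winner-take-all disparity selection; per the abstract, the paper's adaptive dynamic variant additionally computes each arm's shape automatically for every measured point.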


Notes

  1. https://github.com/TGUMobileVision/CrossArmFusion.



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61973234, in part by the Tianjin Natural Science Foundation under Grant 18JCZDJC96700 and Grant 20JCYBJC00180, and in part by Tianjin Science Fund for Distinguished Young Scholars under Grant 19JCJQJC62100.

Author information


Corresponding author

Correspondence to Baoquan Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mo, H., Li, B., Shi, W. et al. Cross-based dense depth estimation by fusing stereo vision with measured sparse depth. Vis Comput 39, 4339–4350 (2023). https://doi.org/10.1007/s00371-022-02594-z
