
Cross-based dense depth estimation by fusing stereo vision with measured sparse depth

  • Original article
  • Published in: The Visual Computer

Abstract

Dense depth estimation is important in robotic systems for tasks such as mapping, localization, and object recognition. Among the available sensors, an active depth sensor provides accurate but sparse measurements of the environment, while a stereo camera pair provides dense but imprecise reconstruction results. In this paper, a tightly coupled fusion method for a depth sensor and a stereo camera is proposed to perform dense depth estimation, combining the advantages of the two types of sensors to achieve better depth estimates. An adaptive dynamic cross-arm algorithm is developed to integrate the sparse depth measurements into camera-dominated semiglobal stereo matching. To obtain the optimal arm length for each measured pixel, the shape of every cross arm is variable and is computed automatically. Comparison experiments on the public KITTI, Middlebury, and Scene Flow datasets are used to evaluate the performance of the proposed method, and real-world experiments are further conducted for verification.
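The exact fusion formulation appears in the full paper and the linked repository; as a rough illustration of the cross-arm idea only, the sketch below computes adaptive arm lengths for a pixel from color similarity (in the spirit of cross-based cost aggregation) and uses them to bias a stereo matching cost volume toward sparse depth seeds. All function names and parameter values (`tau`, `max_arm`, `weight`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cross_arms(img, y, x, tau=20.0, max_arm=17):
    """Adaptive cross-arm lengths (left, right, up, down) for pixel (y, x):
    each arm extends while the color difference to the anchor pixel stays
    below `tau`, up to `max_arm` pixels. Parameters are illustrative."""
    h, w = img.shape[:2]
    anchor = img[y, x].astype(np.float64)

    def extend(dy, dx):
        length = 0
        for step in range(1, max_arm + 1):
            ny, nx = y + dy * step, x + dx * step
            if not (0 <= ny < h and 0 <= nx < w):
                break  # arm reached the image border
            if np.abs(img[ny, nx].astype(np.float64) - anchor).max() > tau:
                break  # color changed too much: likely a depth boundary
            length = step
        return length

    return extend(0, -1), extend(0, 1), extend(-1, 0), extend(1, 0)

def bias_cost_volume(cost, seeds, img, weight=0.5):
    """Hypothetical fusion step: for each sparse seed (y, x, d), make the
    matching cost at disparity d cheaper for every pixel on the seed's
    cross arms, so the measurement guides the subsequent aggregation.
    `cost` is an H x W x D volume; `weight` < 1 strengthens the seed."""
    for y, x, d in seeds:
        left, right, up, down = cross_arms(img, y, x)
        pixels = [(y, x + i) for i in range(-left, right + 1)]
        pixels += [(y + j, x) for j in range(-up, down + 1) if j != 0]
        for ny, nx in pixels:
            cost[ny, nx, d] *= weight
    return cost
```

In a semiglobal pipeline, the biased volume would then pass through the usual multi-directional path aggregation and winner-take-all disparity selection; per the abstract, the paper's adaptive dynamic variant additionally computes each arm's shape automatically for every measured point.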


Notes

  1. https://github.com/TGUMobileVision/CrossArmFusion.



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61973234, in part by the Tianjin Natural Science Foundation under Grant 18JCZDJC96700 and Grant 20JCYBJC00180, and in part by Tianjin Science Fund for Distinguished Young Scholars under Grant 19JCJQJC62100.

Author information


Corresponding author

Correspondence to Baoquan Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mo, H., Li, B., Shi, W. et al. Cross-based dense depth estimation by fusing stereo vision with measured sparse depth. Vis Comput 39, 4339–4350 (2023). https://doi.org/10.1007/s00371-022-02594-z
