
Gated Recurrent Fusion UNet for Depth Completion

Neural Processing Letters

Abstract

Image-guided depth completion has recently been widely explored and has achieved remarkable performance. However, most existing image-guided depth completion methods fuse multilevel or multimodal features by simple element-wise addition or concatenation, which limits fusion effectiveness. To address this issue, we propose a gated recurrent fusion UNet for more effective feature fusion. Specifically, a gated recurrent model adaptively selects and fuses useful information from multilevel features. Moreover, we introduce a dimension-wise attention based CSPN++ module that iteratively refines the depth estimate: the dimension-wise attention fully learns feature interdependencies to produce a more accurate neighborhood affinity matrix for spatial propagation. Comprehensive experiments demonstrate that the proposed method outperforms many state-of-the-art methods in both quantitative evaluations and visual quality.
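As a rough illustration of the gated recurrent fusion idea, the sketch below applies GRU-style gates to fuse a sequence of feature vectors into a running state. This is not the paper's actual architecture (which operates on convolutional feature maps inside a UNet); the weight shapes, names, and the use of flat vectors are all assumptions made for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(h, x, Wz, Wr, Wh):
    """One GRU-style gating step that fuses a feature vector `x`
    (e.g. from one encoder level) into the running fused state `h`.
    Wz, Wr, Wh are (d, 2d) weight matrices (hypothetical shapes)."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)   # update gate: how much new information to admit
    r = sigmoid(Wr @ hx)   # reset gate: how much old state to expose
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))  # candidate fusion
    return (1.0 - z) * h + z * h_tilde  # convex blend of old and new

# Recurrently fuse three hypothetical "levels" of features.
rng = np.random.default_rng(0)
d = 8
Wz, Wr, Wh = (rng.standard_normal((d, 2 * d)) * 0.1 for _ in range(3))
levels = [rng.standard_normal(d) for _ in range(3)]

h = np.zeros(d)
for feat in levels:
    h = gated_fuse(h, feat, Wz, Wr, Wh)
print(h.shape)  # (8,)
```

Because each step is a gate-weighted convex combination of the previous state and a tanh-bounded candidate, the fused state stays bounded regardless of how many levels are folded in, which is one reason gating is preferred over plain addition.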
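The CSPN++-style refinement described above iteratively propagates depth through a learned neighborhood affinity matrix. The sketch below is a heavily simplified, hypothetical version: a fixed 4-neighborhood, hand-set (not learned) affinities, and no dimension-wise attention. It only shows the propagation mechanics.

```python
import numpy as np

def propagate(depth, affinity, steps=5):
    """Iteratively refine a depth map by blending each pixel with its
    4-neighborhood, weighted by per-pixel affinities.
    depth:    (H, W) current depth estimate
    affinity: (4, H, W) non-negative weights for up/down/left/right neighbors
    """
    # Normalize so the neighbor weights plus a self-weight sum to 1 at
    # every pixel, making each update a convex combination.
    total = affinity.sum(axis=0)
    a = affinity / np.maximum(total, 1.0)  # clamp so self-weight >= 0
    self_w = 1.0 - a.sum(axis=0)

    d = depth.copy()
    for _ in range(steps):
        up    = np.roll(d,  1, axis=0)
        down  = np.roll(d, -1, axis=0)
        left  = np.roll(d,  1, axis=1)
        right = np.roll(d, -1, axis=1)
        d = self_w * d + a[0]*up + a[1]*down + a[2]*left + a[3]*right
    return d

# A single spurious depth spike is smoothed toward its neighborhood.
depth = np.zeros((5, 5))
depth[2, 2] = 1.0
aff = np.full((4, 5, 5), 0.25)  # uniform affinities, for illustration only
refined = propagate(depth, aff, steps=3)
```

In the actual method, the affinities come from the network (sharpened by the dimension-wise attention), so propagation smooths across surfaces while stopping at depth discontinuities; the uniform weights here simply make the diffusion behavior visible.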

(Figures 1–10 appear in the full article.)



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61901392 and 62041109) and the Department of Science and Technology of Sichuan Province (Grant No. 2021YJ0109).

Author information


Contributions

TL: Conceptualization, Methodology, Software, Formal Analysis, Writing—Original Draft; XD: Resources, Supervision; HL: Visualization, Writing—Review.

Corresponding author

Correspondence to Tao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Li, T., Dong, X. & Lin, H. Gated Recurrent Fusion UNet for Depth Completion. Neural Process Lett 55, 10463–10481 (2023). https://doi.org/10.1007/s11063-023-11334-w

