Multi-modal Characteristic Guided Depth Completion Network

Lee, Yongjin; Park, Seokjun; Kang, Beomgu; Park, HyunWook

doi:10.1007/978-3-031-26313-2_36

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13843)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Depth completion techniques fuse a sparse depth map from LiDAR with a color image to generate an accurate dense depth map. Typically, multi-modal techniques exploit the complementary characteristics of each modality, overcoming the limited information available from a single modality. In depth completion specifically, LiDAR data provides relatively dense depth information for nearby objects but lacks information about distant objects and their boundaries. A color image, on the other hand, provides dense information even for distant objects, including their boundaries. The complementary characteristics of the two modalities are therefore well suited for fusion, and many depth completion studies have proposed fusion networks to address the sparsity of LiDAR data. However, previous fusion networks tend to simply concatenate the data from the two modalities and rely on a deep neural network to extract useful features, without considering the inherent characteristics of each modality. To enable effective modality-aware fusion, we propose a confidence guidance module (CGM) that estimates confidence maps emphasizing the salient regions for each modality. In our experiments, the confidence map for LiDAR data focused on near areas and object surfaces, while that for the color image focused on distant areas and object boundaries. We also propose a shallow feature fusion module (SFFM) to combine the two input modalities. Furthermore, we propose a parallel refinement stage for each modality to reduce computation time. Our results show that the proposed model achieves much faster computation and competitive performance compared with the top-ranked models on the KITTI depth completion online leaderboard.
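
As a rough illustration of what confidence-guided fusion can look like, the sketch below blends two per-modality depth predictions using per-pixel confidence weights. It is not the CGM described in the abstract, whose internals are not given here. The function name `fuse_with_confidence`, the use of a two-way softmax over confidence logits, and the toy values are all assumptions made for the example.

```python
import math
from typing import List

Grid = List[List[float]]  # a tiny H x W "image" as nested lists


def fuse_with_confidence(depth_lidar: Grid, depth_color: Grid,
                         logit_lidar: Grid, logit_color: Grid) -> Grid:
    """Blend two depth predictions with per-pixel confidence weights.

    Assumption for illustration: each branch emits an unnormalized
    confidence logit per pixel, and a two-way softmax turns the pair
    into weights that are positive and sum to one.
    """
    fused: Grid = []
    for d_l_row, d_c_row, z_l_row, z_c_row in zip(
            depth_lidar, depth_color, logit_lidar, logit_color):
        row = []
        for d_l, d_c, z_l, z_c in zip(d_l_row, d_c_row, z_l_row, z_c_row):
            m = max(z_l, z_c)          # subtract the max for numerical stability
            w_l = math.exp(z_l - m)    # unnormalized weight, LiDAR branch
            w_c = math.exp(z_c - m)    # unnormalized weight, color branch
            row.append((w_l * d_l + w_c * d_c) / (w_l + w_c))
        fused.append(row)
    return fused


# Toy 1 x 2 example (made-up values): the first pixel has equal confidence
# in both branches, the second strongly favors the color branch.
fused = fuse_with_confidence(
    depth_lidar=[[5.0, 40.0]],
    depth_color=[[7.0, 50.0]],
    logit_lidar=[[0.0, -100.0]],
    logit_color=[[0.0, 100.0]],
)
print(fused)
```

With equal logits the result is the plain average of the two predictions. When one logit dominates, the output follows that branch, which mirrors the behavior reported in the abstract, where LiDAR is trusted on near surfaces and color on distant regions and boundaries.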

Y. Lee and S. Park—These authors contributed equally.

Acknowledgements

This work was conducted by the Center for Applied Research in Artificial Intelligence (CARAI) under a grant funded by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD) (UD190031RD).

Author information

Authors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
Yongjin Lee, Seokjun Park, Beomgu Kang & HyunWook Park

Corresponding author

Correspondence to HyunWook Park.

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

About this paper

Cite this paper

Lee, Y., Park, S., Kang, B., Park, H. (2023). Multi-modal Characteristic Guided Depth Completion Network. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13843. Springer, Cham. https://doi.org/10.1007/978-3-031-26313-2_36

DOI: https://doi.org/10.1007/978-3-031-26313-2_36
Published: 02 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26312-5
Online ISBN: 978-3-031-26313-2
eBook Packages: Computer Science, Computer Science (R0)