Abstract
Depth completion techniques fuse sparse depth map from LiDAR with color image to generate accurate dense depth map. Typically, multi-modal techniques utilize complementary characteristics of each modality, overcoming the limited information from a single modality. Especially in the depth completion, LiDAR data has relatively dense depth information for objects in the near distance but lacks the information of distant object and its boundary. On the other hand, color image has dense information for objects even in the far distance including the object boundary. Thus, the complementary characteristics of the two modalities are well suited for fusion, and many depth completion studies have proposed fusion networks to address the sparsity of LiDAR data. However, the previous fusion networks tend to simply concatenate the two-modality data and rely on deep neural network to extract useful features, not considering the inherited characteristics of each modality. To enable the effective modality-aware fusion, we propose a confidence guidance module (CGM) that estimates confidence maps which emphasizes salient region for each modality. In experiment, we showed that the confidence map for LiDAR data focused on near area and object surface, while those for color image focused on distant area and object boundary. Also, we propose a shallow feature fusion module (SFFM) to combine two types of input modality. Furthermore, a parallel refinement stage for each modality is proposed to reduce the computation time. Our results showed that the proposed model showed much faster computation time and competitive performance compared to the top-ranked models on the KITTI depth completion online leaderboard.
Y. Lee and S. Park—These authors contributed equally.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity invariant CNNs. In: 2017 International Conference on 3D Vision (3DV), pp. 11–20. IEEE (2017)
Cheng, X., Wang, P., Yang, R.: Learning depth with convolutional spatial propagation network. IEEE Trans. Pattern Anal. Mach. Intell. 42, 2361–2379 (2019)
Cheng, X., Wang, P., Yang, R.: Depth estimation via affinity learned with convolutional spatial propagation network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 103–119 (2018)
Park, J., Joo, K., Hu, Z., Liu, C.-K., So Kweon, I.: Non-local spatial propagation network for depth completion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 120–136. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0_8
Cheng, X., Wang, P., Guan, C., Yang, R.: CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion. Proc. AAAI Conf. Artif. Intell. 34, 10615–10622 (2020)
Xu, Y., Zhu, X., Shi, J., Zhang, G., Bao, H., Li, H.: Depth completion from sparse lidar data with depth-normal constraints. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2811–2820 (2019)
Liu, L., et al.: FCFR-Net: feature fusion based coarse-to-fine residual learning for monocular depth completion. arXiv preprint arXiv:2012.08270 (2020)
Lee, S., Lee, J., Kim, D., Kim, J.: Deep architecture with cross guidance between single image and sparse lidar data for depth completion. IEEE Access 8, 79801–79810 (2020)
Tang, J., Tian, F.P., Feng, W., Li, J., Tan, P.: Learning guided convolutional network for depth completion. IEEE Trans. Image Process. 30, 1116–1129 (2020)
Van Gansbeke, W., Neven, D., De Brabandere, B., Van Gool, L.: Sparse and noisy lidar completion with RGB guidance and uncertainty. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6. IEEE (2019)
Qiu, J., et al.: Deeplidar: deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3313–3322 (2019)
Hu, M., Wang, S., Li, B., Ning, S., Fan, L., Gong, X.: PENet: towards precise and efficient image guided depth completion. In: 2021 International Conference on Robotics and Automation (ICRA), pp. 13656–13662. IEEE (2021)
Li, A., Yuan, Z., Ling, Y., Chi, W., Zhang, C., et al.: A multi-scale guided cascade hourglass network for depth completion. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 32–40 (2020)
Lee, B.U., Lee, K., Kweon, I.S.: Depth completion using plane-residual representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13916–13925 (2021)
Zhao, S., Gong, M., Fu, H., Tao, D.: Adaptive context-aware multi-modal network for depth completion. IEEE Trans. Image Process. 30, 5264–5276 (2021)
Ma, F., Cavalheiro, G.V., Karaman, S.: Self-supervised sparse-to-dense: self-supervised depth completion from lidar and monocular camera. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 3288–3295. IEEE (2019)
Kanopoulos, N., Vasanthavada, N., Baker, R.L.: Design of an image edge detection filter using the Sobel operator. IEEE J. Solid-State Circuits 23, 358–367 (1988)
Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32, 1231–1237 (2013)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Acknowledgements
This work was conducted by Center for Applied Research in Artificial Intelligence(CARAI) grant funded by Defense Acquisition Program Administration(DAPA) and Agency for Defense Development(ADD) (UD190031RD).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lee, Y., Park, S., Kang, B., Park, H. (2023). Multi-modal Characteristic Guided Depth Completion Network. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13843. Springer, Cham. https://doi.org/10.1007/978-3-031-26313-2_36
Download citation
DOI: https://doi.org/10.1007/978-3-031-26313-2_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26312-5
Online ISBN: 978-3-031-26313-2
eBook Packages: Computer ScienceComputer Science (R0)