Abstract
It is widely believed that sparse supervision is inferior to dense supervision for depth completion, yet the underlying reasons are rarely discussed. Motivated by this, we revisit the task of radar-camera depth completion and present a new method trained with sparse LiDAR supervision that outperforms previous dense-LiDAR-supervised methods in both accuracy and speed. Specifically, when trained with sparse LiDAR supervision, depth completion models usually output depth maps containing significant stripe-like artifacts. We find that this phenomenon is caused by the positional distribution pattern implicitly learned from sparse LiDAR supervision, which we term LiDAR Distribution Leakage (LDL). Based on this understanding, we present a novel Disruption-Compensation radar-camera depth completion framework to address the issue. The Disruption part deliberately disrupts the learning of the LiDAR positional distribution from sparse supervision, while the Compensation part leverages 3D spatial and 2D semantic information to compensate for the information lost to this disruption. Extensive experimental results demonstrate that, by reducing the impact of LDL, our sparsely supervised framework outperforms state-of-the-art densely supervised methods, with an 11.6% improvement in Mean Absolute Error (MAE) and a 1.6× speedup in Frames Per Second (FPS). The code is available at https://github.com/megvii-research/Sparse-Beats-Dense.
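To make the mechanism above concrete, the sketch below shows, in PyTorch, (i) a sparse-supervision loss that is evaluated only at pixels carrying a LiDAR return, and (ii) one simple way a Disruption step could break the fixed positional pattern of those pixels by randomly jittering their locations. The jitter-based disruption and all names here are illustrative assumptions drawn from the abstract's description, not the paper's actual design.

```python
# Minimal sketch of sparse LiDAR supervision with a positional "Disruption".
# The jitter below is a hypothetical stand-in for the paper's Disruption part.
import torch
import torch.nn.functional as F


def disrupt_sparse_gt(gt_depth: torch.Tensor, max_shift: int = 2) -> torch.Tensor:
    """Randomly shift each valid LiDAR pixel by up to max_shift pixels,
    so the network cannot memorize the fixed LiDAR scanline pattern.

    gt_depth: (B, 1, H, W) projected LiDAR map; zeros mark pixels with no return.
    """
    b, _, h, w = gt_depth.shape
    out = torch.zeros_like(gt_depth)
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    for i in range(b):
        mask = gt_depth[i, 0] > 0
        n = int(mask.sum())
        y = (ys[mask] + torch.randint(-max_shift, max_shift + 1, (n,))).clamp(0, h - 1)
        x = (xs[mask] + torch.randint(-max_shift, max_shift + 1, (n,))).clamp(0, w - 1)
        out[i, 0, y, x] = gt_depth[i, 0][mask]  # collisions: last write wins
    return out


def sparse_l1_loss(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """L1 loss evaluated only where the sparse ground truth is valid."""
    mask = gt > 0
    return F.l1_loss(pred[mask], gt[mask])


# Usage: `pred` would come from the depth-completion network; here it is random.
pred = torch.rand(2, 1, 64, 128) * 80.0
gt = torch.rand(2, 1, 64, 128) * 80.0
gt[torch.rand_like(gt) > 0.05] = 0.0  # keep ~5% of pixels as "LiDAR" returns
loss = sparse_l1_loss(pred, disrupt_sparse_gt(gt))
print(f"sparse supervision loss: {loss.item():.3f}")
```

Because only a few percent of pixels are supervised, any fixed pattern in where they fall can leak into the prediction as stripe artifacts; randomizing the positions removes that signal, at the cost of slight label noise near depth boundaries.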
H. Li and M. Jing contributed equally to this work.
Notes
1. These object-level mask annotations will be released as well, along with the code.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, H. et al. (2025). Sparse Beats Dense: Rethinking Supervision in Radar-Camera Depth Completion. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15107. Springer, Cham. https://doi.org/10.1007/978-3-031-72967-6_8
DOI: https://doi.org/10.1007/978-3-031-72967-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72966-9
Online ISBN: 978-3-031-72967-6