
Exploiting Dual-Correlation for Multi-frame Time-of-Flight Denoising

  • Conference paper
  • Conference: Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15080)


Abstract

Recent advancements in Time-of-Flight (ToF) depth denoising have achieved impressive results in removing Multi-Path Interference (MPI) and shot noise. However, existing methods utilize only a single frame of ToF data, neglecting the correlation between frames. In this paper, we propose the first learning-based framework for multi-frame ToF denoising. Unlike existing methods, our framework leverages the correlation between neighboring frames to guide ToF noise removal with a confidence map. Specifically, we introduce a Dual-Correlation Estimation Module, which exploits both intra- and inter-correlation. The intra-correlation explicitly establishes the relevance between the spatial positions of geometric objects within the scene, aiding depth residual initialization. The inter-correlation discerns variations in ToF noise distribution across frames, thereby locating regions with strong ToF noise. To further leverage the dual-correlation, we introduce a Confidence-guided Residual Regression Module that predicts a confidence map, which guides the residual regression to prioritize regions with strong ToF noise. Experimental evaluations consistently show that our framework outperforms existing ToF denoising methods, particularly in reducing strong ToF noise. The source code is available at https://github.com/gtdong-ustc/multi-frame-tof-denoising.
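The core idea sketched in the abstract, where inter-frame variation flags strong-noise regions and a confidence map weights how strongly a predicted residual is applied to the depth, can be illustrated with a toy NumPy sketch. This is not the paper's network: `inter_frame_confidence` and `confidence_guided_update` are hypothetical helpers, and the exponential mapping from inter-frame difference to confidence is an assumption made purely for illustration.

```python
import numpy as np

def inter_frame_confidence(depth_t, depth_prev, scale=1.0):
    # Hypothetical stand-in for the inter-correlation cue: pixels whose depth
    # changes sharply between neighboring frames are treated as likely
    # strong-noise regions and receive confidence close to 1. The exponential
    # mapping is an illustrative choice, not the paper's formulation.
    diff = np.abs(depth_t - depth_prev)
    return 1.0 - np.exp(-scale * diff)

def confidence_guided_update(depth, residual, confidence):
    # Confidence-guided residual regression reduced to its simplest form:
    # the predicted residual is applied most strongly where confidence is
    # high, i.e. where strong ToF noise is suspected, and leaves stable
    # regions nearly untouched.
    confidence = np.clip(confidence, 0.0, 1.0)
    return depth + confidence * residual

# Toy example: one pixel flickers between frames (strong noise suspected),
# the others are temporally stable.
depth = np.array([[2.0, 2.0],
                  [2.0, 3.5]])      # current noisy depth (meters)
prev  = np.array([[2.0, 2.0],
                  [2.0, 2.0]])      # neighboring frame
resid = np.array([[0.1, 0.1],
                  [0.1, -1.4]])     # hypothetical predicted residual

conf = inter_frame_confidence(depth, prev, scale=3.0)
out  = confidence_guided_update(depth, resid, conf)
```

In this toy run, the temporally stable pixels get confidence 0 and stay unchanged, while the flickering pixel gets confidence near 1 and receives almost the full residual correction, which mirrors the prioritization behavior the abstract describes.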



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 62032006, 62131003 and 62021001.

Author information


Corresponding author

Correspondence to Yueyi Zhang.


Electronic supplementary material


Supplementary material 1 (pdf 2806 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Dong, G., Zhang, Y., Sun, X., Xiong, Z. (2025). Exploiting Dual-Correlation for Multi-frame Time-of-Flight Denoising. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15080. Springer, Cham. https://doi.org/10.1007/978-3-031-72670-5_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-72670-5_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72669-9

  • Online ISBN: 978-3-031-72670-5

  • eBook Packages: Computer Science, Computer Science (R0)
