
Estimation of Kinect depth confidence through self-training

Original Article · The Visual Computer

Abstract

Depth data captured by Kinect devices are noisy, and sometimes lost or shifted, especially around depth edges. In this paper, we propose an approach that generates a per-pixel confidence measurement for each depth map captured by a Kinect device in indoor environments through supervised learning. Several distinguishing features drawn from both the color images and the depth maps are selected to train confidence estimators with a Random Forest regressor. With this estimator, we can predict a confidence map for any depth map captured by a Kinect device. No additional hardware, such as an industrial laser scanner, is required, which makes the approach convenient to implement. Experiments demonstrate precise prediction of depth confidence.
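The pipeline the abstract describes (per-pixel features from registered color and depth frames, fed to a Random Forest regressor that predicts a confidence map) can be sketched as below. This is a minimal illustration, not the authors' implementation: the feature set (raw depth, color-gradient magnitude, depth-gradient magnitude, and a hole mask) and the placeholder training labels are assumptions; in the paper, the labels come from the self-training procedure.

# Minimal sketch, not the authors' code: per-pixel depth-confidence
# prediction with scikit-learn's RandomForestRegressor. The feature set
# below is an illustrative assumption, not the paper's exact set.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pixel_features(color, depth):
    """Return an (H*W, 4) feature matrix built from one color/depth frame."""
    gray = color.mean(axis=2)                 # crude luminance
    gy, gx = np.gradient(gray)                # color-edge strength
    dy, dx = np.gradient(depth)               # depth-edge strength
    hole = (depth == 0).astype(float)         # Kinect reports 0 where depth is lost
    feats = np.stack([depth, np.hypot(gx, gy), np.hypot(dx, dy), hole], axis=-1)
    return feats.reshape(-1, feats.shape[-1])

# Toy frames; real labels would come from the paper's self-training step.
rng = np.random.default_rng(0)
color = rng.random((120, 160, 3))
depth = rng.random((120, 160))
conf_labels = rng.random((120, 160))          # placeholder confidence labels

model = RandomForestRegressor(n_estimators=30, n_jobs=-1, random_state=0)
model.fit(pixel_features(color, depth), conf_labels.ravel())

# Predict a confidence map for a new frame, reshaped back to image size.
new_color, new_depth = rng.random((120, 160, 3)), rng.random((120, 160))
conf_map = model.predict(pixel_features(new_color, new_depth)).reshape(120, 160)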





Acknowledgments

The authors gratefully thank the anonymous reviewers for their comments and for their great help in revising this paper. This work is supported by the NSF of China (Nos. U1035004, 61173070, 61202149) and the Key Projects in the National Science & Technology Pillar Program (No. 2013BAH39F00).

Author information


Corresponding author

Correspondence to Fan Zhong.


About this article


Cite this article

Song, X., Zhong, F., Wang, Y. et al. Estimation of Kinect depth confidence through self-training. Vis Comput 30, 855–865 (2014). https://doi.org/10.1007/s00371-014-0965-y

