
Estimation of Kinect depth confidence through self-training

Original Article · The Visual Computer

Abstract

Depth data captured by Kinect devices are noisy, and sometimes lost or shifted, especially around depth edges. In this paper, we propose an approach that generates a per-pixel confidence measurement for each depth map captured by a Kinect device in indoor environments through supervised learning. Several distinguishing features drawn from both the color images and the depth maps are selected to train confidence estimators with a Random Forest regressor. With this estimator, we can predict a confidence map for any depth map captured by a Kinect device. No additional hardware, such as an industrial laser scanner, is required, which makes the approach convenient to implement. Experiments demonstrate precise prediction of depth confidence.
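The pipeline the abstract describes (per-pixel features from registered color and depth frames, fed to a Random Forest regressor that predicts a confidence map) can be sketched as below. This is a minimal illustration, not the authors' implementation: the feature set (raw depth, color-gradient magnitude, depth-gradient magnitude, and a hole mask) and the placeholder training labels are assumptions; in the paper, the labels come from the self-training procedure.

# Minimal sketch, not the authors' code: per-pixel depth-confidence
# prediction with scikit-learn's RandomForestRegressor. The feature set
# below is an illustrative assumption, not the paper's exact set.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pixel_features(color, depth):
    """Return an (H*W, 4) feature matrix built from one color/depth frame."""
    gray = color.mean(axis=2)                 # crude luminance
    gy, gx = np.gradient(gray)                # color-edge strength
    dy, dx = np.gradient(depth)               # depth-edge strength
    hole = (depth == 0).astype(float)         # Kinect reports 0 where depth is lost
    feats = np.stack([depth, np.hypot(gx, gy), np.hypot(dx, dy), hole], axis=-1)
    return feats.reshape(-1, feats.shape[-1])

# Toy frames; real labels would come from the paper's self-training step.
rng = np.random.default_rng(0)
color = rng.random((120, 160, 3))
depth = rng.random((120, 160))
conf_labels = rng.random((120, 160))          # placeholder confidence labels

model = RandomForestRegressor(n_estimators=30, n_jobs=-1, random_state=0)
model.fit(pixel_features(color, depth), conf_labels.ravel())

# Predict a confidence map for a new frame, reshaped back to image size.
new_color, new_depth = rng.random((120, 160, 3)), rng.random((120, 160))
conf_map = model.predict(pixel_features(new_color, new_depth)).reshape(120, 160)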





Acknowledgments

The authors gratefully thank the anonymous reviewers for their comments and for their great help in revising this paper. This work is supported by the NSF of China (Nos. U1035004, 61173070, 61202149) and the Key Projects in the National Science & Technology Pillar Program (No. 2013BAH39F00).

Author information


Corresponding author

Correspondence to Fan Zhong.


About this article


Cite this article

Song, X., Zhong, F., Wang, Y. et al. Estimation of Kinect depth confidence through self-training. Vis Comput 30, 855–865 (2014). https://doi.org/10.1007/s00371-014-0965-y

