A New High Resolution Depth Map Estimation System Using Stereo Vision and Kinect Depth Sensing

Journal of Signal Processing Systems

Abstract

Depth map estimation is an active and long-standing problem in image/video processing and computer vision. Conventional depth estimation algorithms that rely on stereo/multi-view vision or depth sensing devices alone are limited by complicated scenes or by imperfections of the depth sensing devices. On the other hand, the depth maps obtained from stereo/multi-view vision and from depth sensing devices are complementary to each other. This motivates us to develop a new system for high-resolution, high-quality depth estimation through the joint fusion of stereo and Kinect data. We model the observations using a Markov random field (MRF) and formulate the fusion as a maximum a posteriori (MAP) estimation problem. The reliability measures and probability density functions describing the observations from the two devices are also derived. The MAP problem is solved using a multiscale belief propagation (BP) algorithm. To suppress possible estimation noise, the estimated depth map is further refined by color-image-guided depth matting and 2D local polynomial regression (LPR)-based filtering. Experimental results and numerical comparisons show that our system can provide high-quality, high-resolution depth maps, thanks to the complementary strengths of stereo vision and Kinect depth sensors.
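
As a rough illustration of the kind of formulation the abstract names, a generic MAP-MRF fusion objective might be written as below. All symbols are assumptions chosen for this sketch and are not the paper's notation or exact model: d_i is the depth label at pixel i, z^S and z^K are the stereo and Kinect observations, r_i^S and r_i^K are per-pixel reliability weights, phi^S and phi^K are data costs (negative log-likelihoods), psi is a pairwise smoothness cost over neighboring pixels N, and lambda balances the two terms. The equivalence between maximizing the posterior and minimizing the energy assumes the usual Gibbs form of the MRF.

```latex
\hat{d}
  = \arg\max_{d}\; p\!\left(d \mid z^{S}, z^{K}\right)
  = \arg\min_{d}\;
    \sum_{i} \left[ r_i^{S}\,\phi^{S}\!\left(d_i, z_i^{S}\right)
                  + r_i^{K}\,\phi^{K}\!\left(d_i, z_i^{K}\right) \right]
    + \lambda \sum_{(i,j)\in\mathcal{N}} \psi\!\left(d_i, d_j\right)
```

In an objective of this shape, a pixel where one sensor is unreliable (for example, a Kinect hole or a textureless stereo region) gets a small weight for that sensor, so the other observation and the smoothness term dominate the estimate there.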

References

  1. Smisek, J., Jancosek, M., & Pajdla, T. (2011). 3D with Kinect. In Proc. IEEE Workshop Consum. Depth Cameras Comput. Vision (pp. 1154–1160).

  2. Zhang, Z. Y. (2000). A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1330–1334.

  3. Shum, H. Y., Chan, S. C., & Kang, S. B. (2007). Image-based rendering. NY: Springer.

  4. Khoshelham, K., & Oude Elberink, S. (2012). Accuracy and resolution of Kinect depth data for indoor mapping applications. Sensors, 12(2), 1437–1454.

  5. Herrera, C. D., Kannala, J., & Heikkila, J. (2012). Joint depth and color camera calibration with distortion correction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(10), 2058–2064.

  6. Zhang, C., & Zhang, Z. (2011). Calibration between depth and color sensors for commodity depth cameras. In Proc. IEEE Intl. Workshop Hot Topics in 3D, in conjunction with ICME.

  7. Zhang, L., & Seitz, S. (2005). Parameter estimation for MRF stereo. Proc. IEEE Comput. Soc. Conf. CVPR, 2, 288–295.

  8. Yoon, K. J., & Kweon, I. S. (2005). Locally adaptive support-weight approach for visual correspondence search. In Proc. IEEE Intl. Conf. Computer Vision and Pattern Recognition (pp. 924–931).

  9. Zhu, Z. Y., Zhang, S., Chan, S. C., & Shum, H. Y. (2012). Object-based rendering and 3D reconstruction using a moveable image-based system. IEEE Trans. Circuits Syst. Video Technol., 22(10), 1405–1419.

  10. Yang, Q. X. (2012). A non-local cost aggregation method for stereo matching. In Proc. IEEE Comput. Soc. Conf. CVPR (pp. 1402–1409).

  11. Felzenszwalb, P. F., & Huttenlocher, D. P. (2006). Efficient belief propagation for early vision. Intl. Journal. Comput. Vision, 70(1), 41–54.

  12. Zhu, J. J., Wang, L., Yang, R. G., Davis, J. E., & Pan, Z. G. (2011). Reliability fusion of time-of-flight depth and stereo geometry for high quality depth maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(7), 1400–1414.

  13. Chuang, Y., Curless, B., Salesin, D. H., & Szeliski, R. (2001). A Bayesian approach to digital matting. In Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2 (pp. 264–271).

  14. Chan, S. C., Shum, H. Y., & Ng, K. T. (2007). Image-based rendering and synthesis: Technological advances and challenges. IEEE Signal Processing Magazine, 24(6), 22–33.

  15. Foix, S., Alenyà, G., & Torras, C. (2011). Lock-in time-of-flight (ToF) cameras: A survey. IEEE Sensors Journal, 11(9), 1917–1926.

  16. Levoy, M., Pulli, K., Curless, B., Rusinkiewicz, S., Koller, D., Pereira, L., Ginzton, M., Anderson, S., Davis, J., Ginsberg, J., Shade, J., & Fulk, D. (2000). The Digital Michelangelo Project: 3D scanning of large statues. In Proc. Annu. Comput. Graph (pp. 131–144).

  17. Ikeuchi, K., Nakazawa, A., Hasegawa, K., & Ohishi, T. (2003). The Great Buddha Project: Modeling cultural heritage for VR systems through observation. In Proc. IEEE/ACM Intl. Symp. Mixed Augmented Reality (pp. 7–16).

  18. Wang, Z., & Zheng, Z. (2008). A region based stereo matching algorithm using cooperative optimization. In Proc. IEEE Comput. Soc. Conf. CVPR, vol. 1, no. 12 (pp. 2720–2727).

  19. Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.

  20. Sun, J., Li, Y., Kang, S. B., & Shum, H. Y. (2005). Symmetric stereo matching for occlusion handling. In Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2 (pp. 399–406).

  21. Herrera, C. D., Kannala, J., & Heikkila, J. (2011). Accurate and practical calibration of a depth and color camera pair. In Proc. Intl. Conf. Computer Analysis of Images and Patterns, vol. II, LNCS 6855 (pp. 437–445).

  22. Klaus, A., Sormann, M., & Karner, K. (2006). Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. In Proc. IEEE Int. Conf. Pattern Recognition, vol. 3 (pp. 15–18).

  23. Lazaros, N., Sirakoulis, G. C., & Gasteratos, A. (2008). Review of stereo vision algorithms: From software to hardware. International Journal of Optomechatronics, 435–462.

  24. Bleyer, M., Rother, C., & Kohli, P. (2010). Surface stereo with soft segmentation. In Proc. IEEE Comput. Soc. Conf. CVPR (pp. 1570–1577).

  25. Taguchi, Y., Wilburn, B., & Zitnick, L. (2008). Stereo reconstruction with mixed pixels using adaptive over-segmentation. Proc. IEEE Comput. Soc. Conf. CVPR, 1(12), 2720–2727.

  26. Birchfield, S., & Tomasi, C. (1998). A pixel dissimilarity measure that is insensitive to image sampling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 401–406.

  27. Lindner, M., Kolb, A., & Hartmann, K. (2007). Data-fusion of PMD-based distance-information and high-resolution RGB-images. In Proc. Intl. Symp. Signals, Circuits, and Systems (pp. 1–4).

  28. Chiu, W., Blanke, U., & Fritz, M. (2011). Improving the Kinect by cross-modal stereo. In Proc. British Mach. Vision Conf.

  29. Chan, D. Y., & Hsu, C. H. (2013). Regular stereo matching improvement system based on Kinect-supporting mechanism. Open Journal of Applied Sciences, 22–26.

  30. Yang, Q., Wang, L., Yang, R., Stewénius, H., & Nistér, D. (2009). Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(3), 492–504.

  31. Strecha, C., Fransens, R., & Van Gool, L. (2006). Combined depth and outlier estimation in multi-view stereo. In Proc. IEEE Comput. Soc. Conf. CVPR (pp. 2394–2401).

  32. Zhang, S., Wang, C., & Chan, S. C. (2013). A new high resolution depth map estimation system using stereo vision and depth sensing device. In Proc. IEEE Colloq. Signal Process. Applications (pp. 49–53).

  33. Zhang, Z. G., Chan, S. C., Ho, K. L., & Ho, K. C. (2008). On bandwidth selection in local polynomial regression analysis and its application to multi-resolution analysis of non-uniform data. J. Signal Process. Syst. Signal Image and Video Technol., 52(3), 263–280.

Author information

Corresponding author

Correspondence to S. C. Chan.

Additional information

Part of this work was presented at the IEEE Colloquium on Signal Processing and its Applications 2013 [32]. This project is supported in part by a GRF grant from the Hong Kong Research Grants Council and a tier-3 grant from the Hong Kong Innovation and Technology Fund (ITF).

About this article

Cite this article

Zhang, S., Wang, C. & Chan, S.C. A New High Resolution Depth Map Estimation System Using Stereo Vision and Kinect Depth Sensing. J Sign Process Syst 79, 19–31 (2015). https://doi.org/10.1007/s11265-013-0821-8

  • DOI: https://doi.org/10.1007/s11265-013-0821-8
