Abstract
In recent years, fusion camera systems that combine color cameras with Time-of-Flight (TOF) depth sensors have become popular because they can sense depth at real-time frame rates. However, the captured depth maps have low resolution compared with the corresponding color images because of the physical limitations of the TOF depth sensor. Most approaches to enhancing the resolution of captured depth maps rely on the implicit assumption that neighboring pixels with similar values in the color image are also similar in depth. Although many algorithms have been proposed, they still yield erroneous results, especially when region boundaries in the depth map and the color image are not aligned. We therefore propose a novel kernel regression framework for generating a high-quality depth map. The proposed filter is based on the vector pointing similar pixels, a unit vector that points toward similar neighbors in the local region. These vectors are used to detect regions where color edges and depth edges are misaligned. Unlike conventional kernel regression methods, our method properly handles misaligned regions by introducing a numerical analysis of the local structure into the kernel regression framework. Experimental comparisons with other data fusion techniques demonstrate the superiority of the proposed algorithm.
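The color-similarity assumption described above can be illustrated with a minimal sketch of a conventional joint bilateral style upsampling filter. This is not the proposed kernel regression method; it is only the kind of baseline the abstract contrasts against. The function name, the parameters (radius, sigma_s, sigma_r), the nearest-neighbor pre-upsampling step, and the use of a single-channel guide image with values in [0, 1] are all assumptions made for illustration.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide, radius=2, sigma_s=1.5, sigma_r=0.1):
    """Upsample depth_lr to the size of guide (2-D float array, values in [0, 1]).

    Hypothetical baseline sketch: each output depth is a weighted average of
    nearby depths, weighted by spatial distance and by similarity in the guide.
    """
    H, W = guide.shape
    h, w = depth_lr.shape
    guide = guide.astype(float)

    # Nearest-neighbor upsampling of the low-resolution depth to the guide grid.
    rows = np.arange(H) * h // H
    cols = np.arange(W) * w // W
    depth = depth_lr[rows[:, None], cols[None, :]].astype(float)

    # Replicate borders so every pixel has a full neighborhood.
    pad_d = np.pad(depth, radius, mode="edge")
    pad_g = np.pad(guide, radius, mode="edge")

    num = np.zeros((H, W))
    den = np.zeros((H, W))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = slice(radius + dy, radius + dy + H)
            xs = slice(radius + dx, radius + dx + W)
            d = pad_d[ys, xs]          # neighbor depth
            g = pad_g[ys, xs]          # neighbor guide intensity
            w_s = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))  # spatial weight
            w_r = np.exp(-((g - guide) ** 2) / (2.0 * sigma_r ** 2))   # guide-similarity weight
            num += w_s * w_r * d
            den += w_s * w_r
    # den is never zero: the center pixel always contributes weight 1.
    return num / den
```

When the guide edge and the depth edge coincide, this filter keeps the depth discontinuity sharp. When they do not coincide, pixels lying between the two edges are averaged with depths from the wrong side, which is the failure case for misaligned boundaries that the abstract describes.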













Acknowledgement
This research was supported by a grant from the R&D Program (Industrial Strategic Technology Development) funded by the Ministry of Knowledge Economy (MKE), Republic of Korea. The authors are also deeply thankful to all interested persons at MKE and KEIT (Korea Evaluation Institute of Industrial Technology).
About this article
Cite this article
Kim, J., Lee, J., Han, SR. et al. A High Quality Depth Map Upsampling Method Robust to Misalignment of Depth and Color Boundaries. J Sign Process Syst 75, 23–37 (2014). https://doi.org/10.1007/s11265-013-0783-x
DOI: https://doi.org/10.1007/s11265-013-0783-x