A non-parametric depth modification model for registration between color and depth images

Peng, Li; Zhang, Yanduo; Zhou, Huabing; Jiang, Junjun; Ma, Jiayi

doi:10.1007/s11045-018-0599-8

Published: 11 June 2018

Volume 30, pages 1129–1148, (2019)
Multidimensional Systems and Signal Processing

Li Peng¹,²,³, Yanduo Zhang², Huabing Zhou², Junjun Jiang⁴ & Jiayi Ma⁵

Abstract

Although the Microsoft Kinect is the most popular depth camera in computer vision applications, it suffers from low depth accuracy. In this work we propose a novel non-parametric depth modification model that improves the depth accuracy of the Kinect sensor by iteratively registering depth images and color images. In particular, at each iteration we first establish a coarse correspondence based on feature descriptors of Canny edges, and then estimate the fine correspondence using an $L_2E$ algorithm. We replace the single Gaussian model with a non-parametric Gaussian mixture model and build a regularization term to constrain the correlations between functions. The depth data are then corrected and optimized based on the correspondence results. Extensive experiments verify the effectiveness of the proposed approach, and the results demonstrate that our method greatly enhances the depth accuracy of the Kinect sensor compared with baseline methods.

Notes

We implemented RANSAC based on the publicly available code at http://www.robots.ox.ac.uk/~vgg/hzbook/code/.
We implemented CPD based on the publicly available code at http://www.bme.ogi.edu/~myron/matlab/cpd/.
The code of the SC method is available at: https://vision.cornell.edu/se3/publications/.
We implemented $L_2E$(GSM) based on the publicly available code at http://www.escience.cn/people/jiayima/cxdm.html (Ma et al. 2013b).
The datasets are available at: http://rgbd-dataset.cs.washington.edu/.

References

Aydin, V. A., & Foroosh, H. (2017). A linear well-posed solution to recover high-frequency information for super resolution image reconstruction. Multidimensional Systems and Signal Processing, 2, 1–22.
Belongie, S., Malik, J., & Puzicha, J. (2002). Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(4), 509–522.
Bhandari, A. K., Kumar, A., Singh, G. K., & Soni, V. (2016). Dark satellite image enhancement using knee transfer function and gamma correction based on DWT–SVD. Multidimensional Systems and Signal Processing, 27(2), 453–476.
Carmeli, C., Vito, E. D., & Toigo, A. (2006). Vector valued reproducing kernel Hilbert spaces of integrable functions and Mercer theorem. Analysis and Applications, 10(4), 377–408.
Chen, G., & Coulombe, S. (2014). A new image registration method robust to noise. Multidimensional Systems and Signal Processing, 25(3), 601–609.
Dong, H., Figueroa, N., & Saddik, A. E. (2014). Towards consistent reconstructions of indoor spaces based on 6D RGB-D odometry and KinectFusion. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (pp. 1796–1803).
Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
Gao, Y., & Yuille, A. L. (2016). Exploiting symmetry and/or Manhattan properties for 3D object structure estimation from single and multiple images. arXiv:1607.07129.
Gao, Y., Ma, J., & Yuille, A. L. (2017). Semi-supervised sparse representation based classification for face recognition with insufficient labeled samples. IEEE Transactions on Image Processing, 26(5), 2545–2560.
Han, J., & Bhanu, B. (2007). Fusion of color and infrared video for moving human detection. Pattern Recognition, 40(6), 1771–1784.
Han, J., Farin, D., & With, P. H. N. D. (2011). A mixed-reality system for broadcasting sports video to mobile devices. IEEE Multimedia, 18(2), 72–84.
Han, J., Pauwels, E., & Zeeuw, P. D. (2012). Visible and Infrared Image Registration Employing Line-Based Geometric Analysis. Berlin: Springer.
Han, J., Pauwels, E. J., & De Zeeuw, P. (2013a). Visible and infrared image registration in man-made environments employing hybrid visual features. Pattern Recognition Letters, 34(1), 42–51.
Han, J., Shao, L., Xu, D., & Shotton, J. (2013b). Enhanced computer vision with Microsoft Kinect sensor: A review. IEEE Transactions on Cybernetics, 43(5), 1318–1334.
Herrera, C. D., Kannala, C. J., & Heikkilä, J. (2011). Accurate and practical calibration of a depth and color camera pair. In Proceedings of the international conference computer analysis of images and patterns (pp. 437–445).
Herrera, C. D., Kannala, C. J., & Heikkilä, J. (2012). Joint depth and color camera calibration with distortion correction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(10), 2058–2064.
Huang, X., & Zhe, C. (2002). A wavelet-based multisensor image registration algorithm. In International conference on signal processing (Vol. 1, pp. 773–776).
Jiang, J., Chen, C., Ma, J., Wang, Z., Wang, Z., & Hu, R. (2017a). SRLSP: A face image super-resolution algorithm using smooth regression with local structure prior. IEEE Transactions on Multimedia, 19(1), 27–40.
Jiang, J., Ma, J., Chen, C., Jiang, X., & Wang, Z. (2017b). Noise robust face image super-resolution through smooth sparse representation. IEEE Transactions on Cybernetics, 47(11), 3991–4002.
Jiang, J., Ma, X., Chen, C., Lu, T., Wang, Z., & Ma, J. (2017c). Single image super-resolution via locally regularized anchored neighborhood regression and nonlocal means. IEEE Transactions on Multimedia, 19(1), 15–26.
Liu, S., Shi, M., Zhu, Z., & Zhao, J. (2017). Image fusion based on complex-shearlet domain with guided filtering. Multidimensional Systems and Signal Processing, 28(1), 207–224.
Lu, T., Guan, Y., Zhang, Y., Qu, S., & Xiong, Z. (2017a). Robust and efficient face recognition via low-rank supported extreme learning machine. Multimedia Tools and Applications, 12, 1–22.
Lu, T., Xiong, Z., Zhang, Y., Wang, B., & Lu, T. (2017b). Robust face super-resolution via locality-constrained low-rank representation. IEEE Access, 5(99), 13103–13117.
Ma, J., Zhao, J., Tian, J., Bai, X., & Tu, Z. (2013a). Regularized vector field learning with sparse approximation for mismatch removal. Pattern Recognition, 46(12), 3519–3532.
Ma, J., Zhao, J., Tian, J., Tu, Z., & Yuille, A. L. (2013b). Robust estimation of nonrigid transformation for point set registration. In Proceedings of the IEEE conference computer vision and pattern recognition (pp. 2147–2154).
Ma, J., Zhao, J., Tian, J., Yuille, A. L., & Tu, Z. (2014). Robust point matching via vector field consensus. IEEE Transactions on Image Processing, 23(4), 1706–1721.
Ma, J., Qiu, W., Zhao, J., Ma, Y., Yuille, A. L., & Tu, Z. (2015a). Robust $L_2E$ estimation of transformation for non-rigid registration. IEEE Transactions on Signal Processing, 63(5), 1115–1129.
Ma, J., Zhao, J., Ma, Y., & Tian, J. (2015b). Non-rigid visible and infrared face registration via regularized Gaussian fields criterion. Pattern Recognition, 48(3), 772–784.
Ma, J., Zhou, H., Zhao, J., Gao, Y., Jiang, J., & Tian, J. (2015c). Robust feature matching for remote sensing image registration via locally linear transforming. IEEE Transactions on Geoscience and Remote Sensing, 53(12), 6469–6481.
Ma, J., Jiang, J., Liu, C., & Li, Y. (2017a). Feature guided Gaussian mixture model with semi-supervised EM and local geometric constraint for retinal image registration. Information Sciences, 417, 128–142.
Ma, J., Zhao, J., Guo, H., Jiang, J., Zhou, H., & Gao, Y. (2017b). Locality preserving matching. In Proceedings of the international joint conference on artificial intelligence (pp. 4492–4498).
Ma, J., Jiang, J., Zhou, H., Zhao, J., & Guo, X. (2018). Guided locality preserving feature matching for remote sensing image registration. IEEE Transactions on Geoscience and Remote Sensing.
Ma, J., Ma, Y., & Li, C. (2019). Infrared and visible image fusion methods and applications: A survey. Information Fusion, 45, 153–178.
Micchelli, C. A., & Pontil, M. (2005). On learning vector-valued functions. Neural Computation, 17(1), 177–204.
Myronenko, A., & Song, X. (2010). Point set registration: Coherent point drift. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(12), 2262–2275.
Peng, L., Zhang, Y., Zhou, H., & Lu, T. (2018). A robust method for estimating image geometry with local structure constraint. IEEE Access, 6, 20734–20747.
Poggio, T., Torre, V., & Koch, C. (1985). Computational vision and regularization theory. Nature, 317(6035), 638–643.
Qian, Q., & Gunturk, B. K. (2016). Extending depth of field and dynamic range from differently focused and exposed images. Multidimensional Systems and Signal Processing, 27(2), 493–509.
Raposo, C., Barreto, J. P., & Nunes, U. (2013). Fast and accurate calibration of a Kinect sensor. In Proceedings of the international conference on 3D Vision (pp. 342–349).
Scott, D. W. (2001). Parametric statistical modeling by minimum integrated square error. Technometrics, 43(3), 274–285.
Sevcenco, I. S., Hampton, P. J., & Agathoklis, P. (2015). A wavelet based method for image reconstruction from gradient data with applications. Multidimensional Systems and Signal Processing, 26(3), 717–737.
Smisek, J., Jancosek, M., & Pajdla, T. (2013). 3D with Kinect. In J. Smisek, M. Jancosek, & T. Pajdla (Eds.), Consumer depth cameras for computer vision (pp. 3–25). Berlin: Springer.
Sun, S., Liu, R., Yang, C., Zhou, H., Zhao, J., & Ma, J. (2016). Comparative study on the speckle filters for the very high-resolution polarimetric synthetic aperture radar imagery. Journal of Applied Remote Sensing, 10(4), 045014.
Tong, J., Zhou, J., Liu, L., Pan, Z., & Yan, H. (2012). Scanning 3D full human bodies using Kinect. IEEE Transactions on Visualization and Computer Graphics, 18(4), 643–650.
Wand, M. P., & Jones, M. C. (1994). Kernel smoothing. Biometrics, 54, 674.
Wang, G., Wang, Z., Zhao, W., & Zhou, Q. (2014). Robust Point Matching Using Mixture of Asymmetric Gaussians for Nonrigid Transformation. Berlin: Springer.
Wang, Q., Li, J., Sullivan, G. J., & Sun, M. T. (2011). Reduced-complexity search for video coding geometry partitions using texture and depth data. In Proceedings of the IEEE visual communications and image processing (pp. 1–4).
Wang, Q., Sun, M. T., Sullivan, G. J., & Li, J. (2012). Complexity-reduced geometry partition search and high efficiency prediction for video coding. In Proceedings of the IEEE international symposium on circuits and systems (pp. 133–136).
Wei, Y., You, X., & Li, H. (2016). Multiscale patch-based contrast measure for small infrared target detection. Pattern Recognition, 58, 216–226.
Wei, Y., Zhou, Y., & Li, H. (2017). Spectral-spatial response for hyperspectral image classification. Remote Sensing, 9(3), 203.
Wu, S., He, X., Lu, H., & Yuille, A. L. (2010). A unified model of short-range and long-range motion perception. In Advances in neural information processing systems (pp. 2478–2486).
Yang, C., Zhou, H., Sun, S., Liu, R., Zhao, J., & Ma, J. (2014). Good match exploration for infrared face recognition. Infrared Physics and Technology, 67, 111–115.
Yang, L., Zhang, L., & Dong, H. (2015). Evaluating and improving the depth accuracy of Kinect for Windows v2. Sensors, 15(8), 4275–4285.
Yu, Z., Zhou, H., & Li, C. (2017). Fast non-rigid image feature matching for agricultural UAV via probabilistic inference with regularization techniques. Computers and Electronics in Agriculture, 143, 79–89.
Yuille, A., & Ullman, S. (1987). Rigidity and smoothness of motion. Cambridge: Massachusetts Institute of Technology.
Zhang, C., & Zhang, Z. (2011). Calibration between depth and color sensors for commodity depth cameras. In Proceedings of the IEEE conference on computer vision and pattern recognition.
Zhao, J., Ma, J., Tian, J., Ma, J., & Zhang, D. (2011). A robust method for vector field learning with application to mismatch removing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2977–2984).
Zhou, H., Zhang, D., Chen, C., & Tian, J. (2011). Discarding wide baseline mismatches with global and local transformation consistency. Electronics Letters, 47(1), 25–26.
Zhou, H., Ma, J., Yang, C., Sun, S., Liu, R., & Zhao, J. (2016). Non-rigid feature matching for remote sensing images via probabilistic inference with global and local regularizations. IEEE Geoscience and Remote Sensing Letters, 13(3), 374–378.
Zhou, Y., & Wei, Y. (2016). Learning hierarchical spectral-spatial features for hyperspectral image classification. IEEE Transactions on Cybernetics, 46(7), 1667–1678.

Author information

Authors and Affiliations

Huazhong University of Science and Technology, Wuhan, 430074, China
Li Peng
Hubei Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan, China
Li Peng, Yanduo Zhang & Huabing Zhou
Hubei Radio and TV University, Wuhan, China
Li Peng
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Junjun Jiang
Electronic Information School, Wuhan University, Wuhan, 430072, China
Jiayi Ma

Corresponding author

Correspondence to Yanduo Zhang.

Additional information

This work was supported by the National Natural Science Foundation of China (Nos. 41501505, 61773295, 61503288, 61501413, 61502354).

Appendix

1.1 Proof of Eq. (3)

We have $F(\epsilon ) = \sum _{k = 1}^{4}w_k \phi (\epsilon |0,\sigma _{k}^2)$, where $\phi (\epsilon |0,\sigma _k^2) = \frac{1}{\sqrt{2\pi }\sigma _k}e^{-\frac{\epsilon ^2}{2\sigma _k^2}}$ and $w_k$ denotes the probability that a match point belongs to the $k$-th match function. Write $\phi _k = \phi (\epsilon |0,\sigma _k^2)$ for $k = 1,\ldots ,4$, and let $M = \int (F(\epsilon ))^2d\epsilon $. Then

$$\begin{aligned} M&=\int \left( \sum _{k = 1}^{4}w_k \phi (\epsilon |0,\sigma _k^2)\right) ^2d\epsilon \nonumber \\&= \sum _{k = 1}^{4}w_k^2\int \phi _k^2d\epsilon + 2\sum _{1\le i<j\le 4}w_iw_j\int \phi _i\phi _jd\epsilon \nonumber \\&=\sum _{k=1}^4\frac{w_k^2}{2^d(\pi \sigma _k)^{\frac{d}{2}}} + 2\sum _{1\le i<j\le 4}w_iw_j\int \phi _i\phi _jd\epsilon . \end{aligned}$$

(12)

According to Wand and Jones (1994), $\int \phi _i\phi _jd\epsilon = \phi (0|0,\sigma _i^2+\sigma _j^2)$, so we obtain:

$$\begin{aligned} M&=\sum _{k=1}^4\frac{w_k^2}{2^d(\pi \sigma _k)^{\frac{d}{2}}} +2\sum _{1\le i<j\le 4}w_iw_j \phi (0|0,\sigma _i^2+\sigma _j^2)\nonumber \\&=\sum _{k=1}^4\frac{w_k^2}{2^d(\pi \sigma _k)^{\frac{d}{2}}} + A, \end{aligned}$$

(13)

where $A$ denotes the sum of the cross terms, which is treated as a constant and omitted. This gives:

$$\begin{aligned} \int (F(\epsilon ))^2d\epsilon&=\sum _{k=1}^4\frac{w_k^2}{2^d(\pi \sigma _k)^{\frac{d}{2}}}. \end{aligned}$$

(14)
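As a numerical sanity check on the Gaussian product identity used between Eqs. (12) and (13), the following sketch compares a direct quadrature of $\int (F(\epsilon ))^2d\epsilon $ with the closed form $\sum _{i,j}w_iw_j\phi (0|0,\sigma _i^2+\sigma _j^2)$ for the one-dimensional density defined above. The weights, variances, grid settings, and function names are illustrative assumptions rather than values from the paper, and the sketch does not attempt to reproduce the $d$-dimensional normalization written in Eqs. (12)–(14).

```python
import numpy as np


def gauss(eps, var):
    """Zero-mean 1-D Gaussian density phi(eps | 0, var), where `var` is the variance."""
    return np.exp(-np.square(eps) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


def integrate(values, grid):
    """Trapezoidal rule on a uniform grid."""
    step = grid[1] - grid[0]
    return float(np.sum(0.5 * (values[:-1] + values[1:])) * step)


def mixture_sq_integral_numeric(weights, variances, half_width=60.0, n=400001):
    """Approximate the integral of F(eps)^2 by quadrature on [-half_width, half_width]."""
    grid = np.linspace(-half_width, half_width, n)
    mixture = sum(w * gauss(grid, v) for w, v in zip(weights, variances))
    return integrate(mixture ** 2, grid)


def mixture_sq_integral_closed(weights, variances):
    """Closed form: sum over all ordered pairs (i, j) of w_i w_j phi(0 | 0, var_i + var_j)."""
    w = np.asarray(weights, dtype=float)
    v = np.asarray(variances, dtype=float)
    pair_terms = gauss(0.0, v[:, None] + v[None, :])  # entry (i, j) uses var_i + var_j
    return float(w @ pair_terms @ w)


# Hypothetical four-component mixture (values chosen only for illustration).
weights = [0.4, 0.3, 0.2, 0.1]
variances = [0.25, 1.0, 4.0, 9.0]  # sigma_k^2

numeric = mixture_sq_integral_numeric(weights, variances)
closed = mixture_sq_integral_closed(weights, variances)
print(f"quadrature: {numeric:.10f}")
print(f"closed form: {closed:.10f}")
```

The diagonal entries of `pair_terms` correspond to the squared terms $\int \phi _k^2d\epsilon $, and the off-diagonal entries to the cross terms, so the double sum over ordered pairs covers both parts of the expansion.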

1.2 Vector-valued reproducing Kernel Hilbert space

We review the basic theory of vector-valued reproducing kernel Hilbert space here, and for further details and references please see Carmeli et al. (2006), Micchelli and Pontil (2005).

Let $X$ be a set, e.g., $X \subseteq {\mathbb {R}}^P$; let $Y$ be a real Hilbert space with inner product (norm) $\langle \cdot ,\cdot \rangle $ $(\Vert \cdot \Vert )$, e.g., $Y \subseteq {\mathbb {R}}^D$; and let $H$ be a Hilbert space with inner product (norm) $\langle \cdot ,\cdot \rangle _H$ $(\Vert \cdot \Vert _H)$, where $P=D=2$ or $3$ for the point matching problem. Note that a norm can be induced by an inner product, i.e., $\Vert f\Vert _H = \sqrt{\langle f,f \rangle _H}$ for all $f \in H$. A Hilbert space is a real or complex inner product space that is also a complete metric space with respect to the distance function induced by the inner product. A vector-valued RKHS can then be defined as follows.

Definition 1

A Hilbert space $H$ is an RKHS if the evaluation maps $ev_x:H \rightarrow Y$ (i.e., $ev_x(f)= f(x)$) are bounded; that is, for every $x \in X$ there exists a positive constant $C_x$ such that

$$\begin{aligned} \Vert ev_x(f)\Vert = \Vert f(x)\Vert \le C_x\Vert f\Vert _H, \quad \forall f \in H. \end{aligned}$$

(15)

A reproducing kernel $\varGamma :X \times X \rightarrow B(Y)$ is then defined as $\varGamma (x,x'):=ev_xev_{x'}^*$, where $B(Y)$ is the Banach space of bounded linear operators on $Y$ (so that $\varGamma (x,x') \in B(Y)$ for all $x,x' \in X$), e.g., $B(Y) \subseteq {\mathbb {R}}^{D\times D}$, and $ev_x^*$ is the adjoint of $ev_x$. The RKHS and its kernel have the following two properties.

Remark 1

The kernel $\varGamma $ reproduces the value of a function $f\in H$ at a point $x \in X$. Indeed, for any $x \in X$ and $y \in Y$, we have $ev_x^*y=\varGamma (\cdot ,x)y$, so that $\langle f(x),y \rangle = \langle f,\varGamma (\cdot ,x)y\rangle _H$.

Remark 2

An RKHS defines a corresponding reproducing kernel. Conversely, a reproducing kernel defines a unique RKHS.

More specifically, for any $M \in {\mathbb {N}}, \lbrace x_i \rbrace _{i=1}^M \subseteq X$, and a reproducing kernel $\varGamma $, a unique RKHS can be defined by considering the completion of the space:

$$\begin{aligned} H_M=\left\{ \sum _{i=1}^M\varGamma (\cdot ,x_i)c_i:c_i \in Y\right\} , \end{aligned}$$

(16)

with respect to the norm induced by the inner product

$$\begin{aligned} \langle f,g \rangle _H=\sum _{i,j=1}^M \langle \varGamma (x_j,x_i)c_i,d_j\rangle , \quad \forall f,g \in H_M, \end{aligned}$$

(17)

where $f = \sum _{i=1}^M \varGamma (\cdot ,x_i)c_i$ and $g = \sum _{j=1}^M \varGamma (\cdot ,x_j)d_j$.
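The finite expansion in Eq. (16) and the inner product in Eq. (17) can be illustrated with a small numerical sketch. It assumes a diagonal Gaussian kernel $\varGamma (x,x') = e^{-\beta \Vert x-x'\Vert ^2}I$, which is one common choice and is not specified in this appendix; the parameter `beta`, the random points, and all function names are likewise hypothetical. The sketch evaluates $f(x)=\sum _i\varGamma (x,x_i)c_i$ and checks the reproducing property of Remark 1 at one of the centers $x_m$, where $g=\varGamma (\cdot ,x_m)y$ lies in $H_M$.

```python
import numpy as np


def gaussian_kernel(x, xp, beta=0.5):
    """Assumed matrix-valued kernel Gamma(x, x') = exp(-beta * ||x - x'||^2) * I."""
    dim = x.shape[0]
    return np.exp(-beta * np.sum((x - xp) ** 2)) * np.eye(dim)


def evaluate(centers, coeffs, x):
    """f(x) = sum_i Gamma(x, x_i) c_i, an element of H_M as in Eq. (16)."""
    return sum(gaussian_kernel(x, xi) @ ci for xi, ci in zip(centers, coeffs))


def inner_product(centers, c, d):
    """<f, g>_H = sum_{i,j} <Gamma(x_j, x_i) c_i, d_j>, following Eq. (17)."""
    total = 0.0
    for i, xi in enumerate(centers):
        for j, xj in enumerate(centers):
            total += float(d[j] @ (gaussian_kernel(xj, xi) @ c[i]))
    return total


rng = np.random.default_rng(0)
centers = rng.normal(size=(5, 2))   # hypothetical points x_i in R^2 (P = 2)
c = rng.normal(size=(5, 2))         # coefficients c_i in R^2 (D = 2)

# g = Gamma(., x_m) y has coefficient y at index m and zero elsewhere.
m = 3
y = np.array([1.0, -2.0])
d = np.zeros_like(c)
d[m] = y

lhs = float(evaluate(centers, c, centers[m]) @ y)  # <f(x_m), y>
rhs = inner_product(centers, c, d)                 # <f, Gamma(., x_m) y>_H
print(abs(lhs - rhs) < 1e-10)

squared_norm = inner_product(centers, c, c)        # ||f||_H^2
print(squared_norm > 0.0)
```

With a diagonal kernel of this form, the $D$ output components are decoupled and each one behaves like a scalar-valued RKHS function with the same scalar Gaussian kernel.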

About this article

Cite this article

Peng, L., Zhang, Y., Zhou, H. et al. A non-parametric depth modification model for registration between color and depth images. Multidim Syst Sign Process 30, 1129–1148 (2019). https://doi.org/10.1007/s11045-018-0599-8

Received: 08 January 2018
Revised: 02 May 2018
Accepted: 05 June 2018
Published: 11 June 2018
Issue Date: 01 July 2019
DOI: https://doi.org/10.1007/s11045-018-0599-8
