Local tangent space alignment based on Hilbert–Schmidt independence criterion regularization

  • Short paper
  • Pattern Analysis and Applications

Abstract

Local tangent space alignment (LTSA) is a well-known manifold learning algorithm, and many other manifold learning algorithms have been developed based on it. However, from the viewpoint of dimensionality reduction, LTSA preserves only local features. What the dimensionality reduction community is now pursuing are algorithms capable of preserving local and global features at the same time. In this paper, a new dimensionality reduction algorithm, called HSIC-regularized LTSA (HSIC–LTSA), is proposed, in which an HSIC regularization term is added to the objective function of LTSA. HSIC stands for the Hilbert–Schmidt independence criterion and has been used in many machine learning applications. However, HSIC has so far neither been applied directly to dimensionality reduction nor been used as a regularization term in combination with other machine learning algorithms. The proposed HSIC–LTSA is therefore a new attempt for both HSIC and LTSA. In HSIC–LTSA, HSIC makes the high- and low-dimensional data as statistically dependent as possible, while LTSA reduces the data dimension under the local homeomorphism-preserving criterion. The experimental results presented in this paper show that, on several commonly used datasets, HSIC–LTSA performs better than LTSA as well as some state-of-the-art local and global preserving algorithms.
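The core of the proposal is to add an empirical HSIC term, measuring the statistical dependence between the high-dimensional inputs X and their low-dimensional embedding Y, to the LTSA objective. As a rough illustration only (this is not the authors' implementation; the Gaussian kernels, bandwidths, and function names are assumptions of this sketch), the standard biased empirical HSIC estimator HSIC(X, Y) = tr(KHLH)/(n − 1)², on which such a regularizer is typically built, can be computed as follows:

```python
# A minimal sketch, assuming Gaussian kernels and NumPy; this is NOT the authors'
# implementation of HSIC-LTSA, only the standard biased empirical HSIC estimator
# HSIC(X, Y) = tr(K H L H) / (n - 1)^2 that such a regularization term builds on.
import numpy as np

def rbf_gram(Z, sigma=1.0):
    """Gaussian (RBF) Gram matrix: K[i, j] = exp(-||z_i - z_j||^2 / (2 sigma^2))."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def empirical_hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples X (n x D) and Y (n x d)."""
    n = X.shape[0]
    K = rbf_gram(X, sigma_x)
    L = rbf_gram(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                            # "high-dimensional" data
    Y_dep = X[:, :2] + 0.1 * rng.normal(size=(200, 2))        # embedding that depends on X
    Y_ind = rng.normal(size=(200, 2))                         # embedding independent of X
    print(empirical_hsic(X, Y_dep), empirical_hsic(X, Y_ind)) # dependent case is larger
```

In HSIC–LTSA this kind of dependence measure is maximized jointly with the LTSA alignment criterion, so that the embedding stays statistically related to the original data; the exact kernels and weighting used in the paper may differ from this sketch.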

References

  1. van der Maaten LJP, Postma EO, van der Herik HJ (2007) Dimensionality reduction: a comparative review. J Mach Learn Res 10(1):66–71

  2. Tenenbaum JB, De Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323

  3. Schölkopf B, Smola A, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319

  4. Weinberger KQ, Sha F, Saul LK (2004) Learning a kernel matrix for nonlinear dimensionality reduction. In: Proceedings of the twenty-first international conference on Machine learning. ACM

  5. Lafon S, Lee AB (2006) Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Trans Pattern Anal Mach Intell 28(9):1393–1403

  6. Zhang Z, Zha H (2004) Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J Sci Comput 26(1):313–338

  7. He X, Niyogi P (2003) Locality preserving projections. Adv Neural Inf Process Syst 16(1):186–197

  8. Chen J, Ma Z, Liu Y (2013) Local coordinates alignment with global preservation for dimensionality reduction. IEEE Trans Neural Netw Learn Syst 24(1):106–117

  9. Liu X et al (2014) Global and local structure preservation for feature selection. IEEE Trans Neural Netw Learn Syst 25(6):1083–1095

  10. Gretton A et al (2005) Measuring statistical dependence with Hilbert–Schmidt norms. In: International conference on algorithmic learning theory. Springer, Berlin

  11. Yan K, Kou L, Zhang D (2017) Learning domain-invariant subspace using domain features and independence maximization. IEEE Trans Cybern 48:288–299

  12. Damodaran BB, Courty N, Lefèvre S (2017) Sparse Hilbert Schmidt independence criterion and surrogate-kernel-based feature selection for hyperspectral image classification. IEEE Trans Geosci Remote Sens 55(4):2385–2398

  13. Gangeh MJ, Zarkoob H, Ghodsi A (2017) Fast and scalable feature selection for gene expression data using Hilbert–Schmidt independence criterion. IEEE ACM Trans Comput Biol Bioinform 14(1):167–181

  14. Xiao M, Guo Y (2015) Feature space independent semi-supervised domain adaptation via kernel matching. IEEE Trans Pattern Anal Mach Intell 37(1):54–66

  15. Zhong W et al (2010) Incorporating the loss function into discriminative clustering of structured outputs. IEEE Trans Neural Netw 21(10):1564–1575

  16. Boothby WM (2007) An introduction to differentiable manifolds and Riemannian geometry. Elsevier (Singapore) Pte Ltd., Singapore

  17. Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326

  18. Donoho DL, Grimes C (2003) Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci 100(10):5591–5596

  19. Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14(6):585–591

  20. He X, Yan S, Hu Y, Niyogi P, Zhang H (2005) Face recognition using Laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340

  21. Pang Y, Zhang L, Liu Z, Yu N, Li H (2005) Neighborhood preserving projections (NPP): a novel linear dimension reduction method. Proc ICIC Pattern Anal Mach Intell 1:117–125

  22. Cai D, He X, Han J, Zhang H (2006) Orthogonal Laplacianfaces for face recognition. IEEE Trans Image Process 15(11):3608–3614

  23. Kokiopoulou E, Saad Y (2007) Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique. IEEE Trans Pattern Anal Mach Intell 29(12):2143–2156

  24. Yan S, Xu D, Zhang B, Zhang H, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51

  25. Saul LK, Roweis ST (2003) Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res 4(1):119–155

  26. Qiao H, Zhang P, Wang D, Zhang B (2013) An explicit nonlinear mapping for manifold learning. IEEE Trans Cybern 43(1):51–63

  27. Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7(1):2399–2434

  28. Jost J (2008) Riemannian geometry and geometric analysis. Springer, Berlin

  29. Spivak M (1981) A comprehensive introduction to differential geometry, vol 4. Publish or Perish

  30. Kreyszig E (1981) Introductory functional analysis with applications. Wiley, New York

  31. Mika S et al (1999) Fisher discriminant analysis with kernels. In: Neural networks for signal processing IX

  32. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  33. Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  34. Gohberg I, Goldberg S, Kaashoek MA (1990) Hilbert–Schmidt operators. In: Classes of linear operators, vol 1. Birkhäuser, Basel, pp 138–147

  35. Xiang S et al (2011) Regression reformulations of LLE and LTSA with locally linear transformation. IEEE Trans Syst Man Cybern Part B (Cybern) 41(5):1250–1262

  36. Martin Sagayam K, Jude Hemanth D (2018) ABC algorithm based optimization of 1-D hidden Markov model for hand gesture recognition application. Comput Ind 99:313–323

  37. Sagayam KM, Hemanth DJ, Ramprasad YN, Menon R (2018) Optimization of hand motion recognition system based on 2D HMM approach using ABC algorithm. In: Hybrid intelligent techniques for pattern analysis and understanding. Chapman and Hall, New York

  38. Sagayam KM, Hemanth DJ (2018) Comparative analysis of 1-D HMM and 2-D HMM for hand motion recognition applications. In: Progress in intelligent computing techniques: theory, practice, and applications, advances in intelligent systems and computing. Springer, p 518

  39. Gangeh MJ, Ghodsi A, Kamel MS (2013) Kernelized supervised dictionary learning. IEEE Trans Signal Process 61(19):4753–4767

  40. Gangeh MJ, Fewzee P, Ghodsi A, Kamel MS, Karray F (2014) Multiview supervised dictionary learning in speech emotion recognition. IEEE ACM Trans Audio Speech Lang Process 22(6):1056–1068

  41. Barshan E et al (2011) Supervised PCA visualization classification and regression on subspaces and submanifolds. Pattern Recognit 44:1357–1371

Acknowledgements

We would like to express our sincere appreciation to the anonymous reviewers for their insightful comments, which have greatly aided us in improving the quality of the paper.

Author information

Corresponding author

Correspondence to Xinghua Zheng.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Reproducing kernel Hilbert spaces (RKHS)

HSIC is based on RKHS. Let \(L^{2}\left( \varOmega \right) =\left\{ f\left| \; f:\varOmega \rightarrow R,\ \int \limits _{\varOmega }{{{\left| f\left( x \right) \right| }^{2}}\,{\mathrm{d}}x}<+\infty \right. \right\}\) be the space of square integrable functions on \(\varOmega\). An inner product \(\left\langle \bullet ,\bullet \right\rangle\) can be defined on \(L^{2}\left( \varOmega \right)\) [30]:

$$\begin{aligned} \left\langle f,g \right\rangle =\int \limits _{\varOmega }{f\left( x \right) g\left( x \right) {\mathrm{d}}x} \end{aligned}$$
(52)

It can be proven that \(H=\left( L^{2}\left( \varOmega \right) ,\left\langle \bullet ,\bullet \right\rangle \right)\) is a complete inner product space, i.e., a Hilbert space.
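As a small numerical illustration of the inner product (52) (an example built on this appendix's definitions, not taken from the paper): with \(\varOmega =\left[ 0,\pi \right]\), \(f\left( x \right) =x\) and \(g\left( x \right) =\sin x\), we have \(\left\langle f,g \right\rangle =\int _{0}^{\pi }{x\sin x\,{\mathrm{d}}x}=\pi\), which a quadrature routine confirms:

```python
# Minimal check of the L^2 inner product in Eq. (52), assuming Omega = [0, pi],
# f(x) = x and g(x) = sin(x); the exact value of <f, g> is pi.
import numpy as np
from scipy.integrate import quad

inner_fg, _ = quad(lambda x: x * np.sin(x), 0.0, np.pi)
print(inner_fg, np.pi)   # both print approximately 3.141592653589793
```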

Definition

[30] Let \(H=\left( L^{2}\left( \varOmega \right) ,\left\langle \bullet ,\bullet \right\rangle \right)\). If there is a function \(k:\varOmega \times \varOmega \rightarrow R\) such that

  • For all \(x\in \varOmega\), \({{k}_{x}}=k\left( \bullet ,x \right) \in H\);

  • For all \(f\in H\), \(f\left( x \right) =\left\langle f,k\left( \bullet ,x \right) \right\rangle\)

then H is called a reproducing kernel Hilbert space (RKHS), and k is called the reproducing kernel of H.

The reproducing kernel k can be used to define a map \(\varphi :\varOmega \rightarrow H\) such that, for all \(x\in \varOmega\),

$$\begin{aligned} \varphi \left( x \right) =k\left( \bullet ,x \right) \in H \end{aligned}$$
(53)

It can be easily proven that

$$\begin{aligned} \left\langle \varphi \left( x \right) ,\varphi \left( y \right) \right\rangle =\left\langle {{k}_{x}},k\left( \bullet ,y \right) \right\rangle ={{k}_{x}}\left( y \right) =k\left( y,x \right) =k\left( x,y \right) \end{aligned}$$
(54)

The above equation is used in many kernel methods of machine learning, such as kPCA [3], kLDA [31], and kSVM [32].
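As a concrete, purely illustrative check of Eq. (54): for the homogeneous polynomial kernel \(k\left( x,y \right) ={{\left( {{x}^{T}}y \right) }^{2}}\) on \({{R}^{2}}\), an explicit finite-dimensional feature map \(\varphi \left( x \right) =\left( x_{1}^{2},x_{2}^{2},\sqrt{2}{{x}_{1}}{{x}_{2}} \right)\) is known, and the feature-space inner product reproduces the kernel value (the kernel choice and sample points below are assumptions of this example, not part of the paper):

```python
# Check of Eq. (54), <phi(x), phi(y)> = k(x, y), for the polynomial kernel
# k(x, y) = (x . y)^2 on R^2 with explicit feature map
# phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2). The sample points are arbitrary.
import numpy as np

def k_poly2(x, y):
    return float(np.dot(x, y)) ** 2

def phi(x):
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2.0) * x[0] * x[1]])

x = np.array([1.5, -0.7])
y = np.array([0.3, 2.0])
print(np.dot(phi(x), phi(y)))   # 0.9025: inner product in feature space
print(k_poly2(x, y))            # 0.9025: direct kernel evaluation
```

In the RKHS picture of this appendix, \(\varphi \left( x \right) =k\left( \bullet ,x \right)\) lives in the function space H; the three-dimensional feature map above is simply an equivalent representation for this particular kernel.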

Furthermore, if X is a random variable on \(\varOmega\), then \(\varphi \left( X \right)\) is a random process, and its mean function is defined as

$$\begin{aligned} \mu _{X}\left( u \right)&= E_{X}\left[ \varphi \left( X \right) \left( u \right) \right] = E_{X}\left[ k\left( u,X \right) \right] \\&= \int \limits _{\varOmega }{k\left( u,x \right) p_{X}\left( x \right) {\mathrm{d}}x} \end{aligned}$$
(55)

Then, for all \(f\in H\),

$$\begin{aligned} \left\langle \mu _{X},f \right\rangle&= \int \limits _{\varOmega }{\mu _{X}\left( u \right) f\left( u \right) {\mathrm{d}}u} \\&=\int \limits _{\varOmega }{\left( \int \limits _{\varOmega }{k\left( u,x \right) p_{X}\left( x \right) {\mathrm{d}}x} \right) f\left( u \right) {\mathrm{d}}u} \\&=\int \limits _{\varOmega }{\left( \int \limits _{\varOmega }{k\left( u,x \right) f\left( u \right) {\mathrm{d}}u} \right) p_{X}\left( x \right) {\mathrm{d}}x} \\&=\int \limits _{\varOmega }{\left\langle f,k\left( \bullet ,x \right) \right\rangle p_{X}\left( x \right) {\mathrm{d}}x} \\&=\int \limits _{\varOmega }{f\left( x \right) p_{X}\left( x \right) {\mathrm{d}}x}=E_{X}\left[ f\left( X \right) \right] \end{aligned}$$
(56)
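Eq. (56) can be illustrated by a Monte Carlo check (the one-dimensional Gaussian kernel, the Gaussian law of X, and the choice \(f=k\left( \bullet ,z \right)\) below are assumptions of this example): the empirical mean embedding gives \(\left\langle {{\hat{\mu }}_{X}},f \right\rangle =\frac{1}{n}\sum \nolimits _{i}{k\left( {{x}_{i}},z \right) }\), which should agree with \(E_{X}\left[ f\left( X \right) \right]\), available here in closed form.

```python
# Monte Carlo check of Eq. (56), <mu_X, f> = E_X[f(X)], with f(.) = k(., z) for a
# Gaussian kernel and X ~ N(0, s_x^2); sigma, s_x and z are arbitrary choices.
import numpy as np

sigma, s_x, z = 0.8, 1.0, 0.5
rng = np.random.default_rng(1)
x = rng.normal(0.0, s_x, size=100_000)

def k(u, v):
    return np.exp(-(u - v) ** 2 / (2.0 * sigma ** 2))

# <mu_hat_X, k(., z)> = (1/n) sum_i <k(., x_i), k(., z)> = (1/n) sum_i k(x_i, z)
embedding_side = np.mean(k(x, z))

# Closed form of E_X[k(X, z)] when X is Gaussian and k is a Gaussian kernel
expectation_side = sigma / np.sqrt(sigma ** 2 + s_x ** 2) \
    * np.exp(-z ** 2 / (2.0 * (sigma ** 2 + s_x ** 2)))
print(embedding_side, expectation_side)   # agree up to Monte Carlo error
```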

It can be proven that an RKHS can be generated from a kernel function. The definition of a kernel function is as follows:

Definition

[33] Let \(k:\varOmega \times \varOmega \rightarrow R\). If k satisfies the following conditions:

  • Symmetric: for all \(x,y\in \varOmega\), \(k\left( x,y \right) =k\left( y,x \right)\)

  • Square integrable: for all \(x\in \varOmega\), \({{k}_{x}}=k\left( \bullet ,x \right)\) is square integrable

  • Positive definite: for all \({{x}_{1}},\ldots ,{{x}_{N}}\in \varOmega\), the Gram matrix \(\left[ \begin{array}{lll} k\left( {{x}_{1}},{{x}_{1}} \right) &{} \ldots &{} k\left( {{x}_{1}},{{x}_{N}} \right) \\ \vdots &{} \ddots &{} \vdots \\ k\left( {{x}_{N}},{{x}_{1}} \right) &{} \ldots &{} k\left( {{x}_{N}},{{x}_{N}} \right) \\ \end{array} \right]\) is positive definite,

then k is called a kernel function.
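These conditions can be checked numerically on a finite sample. A minimal sketch (the Gaussian kernel and the random sample are assumptions of this example, not part of the definition):

```python
# Numerical check of the kernel-function conditions for the Gaussian kernel
# k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) on a random sample: the Gram matrix
# should be symmetric and have (numerically) non-negative eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
sigma = 1.2

sq = np.sum(X ** 2, axis=1)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)  # squared distances
K = np.exp(-D2 / (2.0 * sigma ** 2))                              # Gram matrix

print(np.allclose(K, K.T))                   # symmetry: True
print(np.linalg.eigvalsh(K).min() > -1e-10)  # positive definite up to round-off: True
```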

Remark

Kernel functions and reproducing kernels are not the same concept: a kernel function is defined on its own, while a reproducing kernel is defined with respect to an RKHS.

Theorem

[30] A kernel function k can be used to generate a unique RKHS \({{H}_{k}}\) such that k becomes the reproducing kernel of \({{H}_{k}}\).

Based on this theorem, once a kernel function is chosen, an RKHS is determined as well.

About this article

Cite this article

Zheng, X., Ma, Z. & Li, L. Local tangent space alignment based on Hilbert–Schmidt independence criterion regularization. Pattern Anal Applic 23, 855–868 (2020). https://doi.org/10.1007/s10044-019-00810-6
