Local tangent space alignment based on Hilbert–Schmidt independence criterion regularization

  • Short paper
  • Pattern Analysis and Applications

Abstract

Local tangent space alignment (LTSA) is a well-known manifold learning algorithm, and many other manifold learning algorithms have been developed based on it. However, from the viewpoint of dimensionality reduction, LTSA preserves only local features. What the dimensionality reduction community is now pursuing are algorithms capable of preserving local and global features at the same time. In this paper, a new dimensionality reduction algorithm, called HSIC-regularized LTSA (HSIC–LTSA), is proposed, in which an HSIC regularization term is added to the objective function of LTSA. HSIC stands for the Hilbert–Schmidt independence criterion and has been used in many machine learning applications. However, HSIC has so far neither been applied directly to dimensionality reduction nor been used as a regularization term in combination with other machine learning algorithms. The proposed HSIC–LTSA is therefore a new attempt for both HSIC and LTSA. In HSIC–LTSA, HSIC makes the high- and low-dimensional data as statistically dependent as possible, while LTSA reduces the data dimension under the local homeomorphism-preserving criterion. The experimental results presented in this paper show that, on several commonly used datasets, HSIC–LTSA performs better than LTSA as well as some state-of-the-art local and global preserving algorithms.
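The core of the proposal is to add an empirical HSIC term, measuring the statistical dependence between the high-dimensional inputs X and their low-dimensional embedding Y, to the LTSA objective. As a rough illustration only (this is not the authors' implementation; the Gaussian kernels, bandwidths, and function names are assumptions of this sketch), the standard biased empirical HSIC estimator HSIC(X, Y) = tr(KHLH)/(n − 1)², on which such a regularizer is typically built, can be computed as follows:

```python
# A minimal sketch, assuming Gaussian kernels and NumPy; this is NOT the authors'
# implementation of HSIC-LTSA, only the standard biased empirical HSIC estimator
# HSIC(X, Y) = tr(K H L H) / (n - 1)^2 that such a regularization term builds on.
import numpy as np

def rbf_gram(Z, sigma=1.0):
    """Gaussian (RBF) Gram matrix: K[i, j] = exp(-||z_i - z_j||^2 / (2 sigma^2))."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def empirical_hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples X (n x D) and Y (n x d)."""
    n = X.shape[0]
    K = rbf_gram(X, sigma_x)
    L = rbf_gram(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                            # "high-dimensional" data
    Y_dep = X[:, :2] + 0.1 * rng.normal(size=(200, 2))        # embedding that depends on X
    Y_ind = rng.normal(size=(200, 2))                         # embedding independent of X
    print(empirical_hsic(X, Y_dep), empirical_hsic(X, Y_ind)) # dependent case is larger
```

In HSIC–LTSA this kind of dependence measure is maximized jointly with the LTSA alignment criterion, so that the embedding stays statistically related to the original data; the exact kernels and weighting used in the paper may differ from this sketch.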

References

  1. van der Maaten LJP, Postma EO, van der Herik HJ (2007) Dimensionality reduction: a comparative review. J Mach Learn Res 10(1):66–71

  2. Tenenbaum JB, De Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323

  3. Schölkopf B, Smola A, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319

  4. Weinberger KQ, Sha F, Saul LK (2004) Learning a kernel matrix for nonlinear dimensionality reduction. In: Proceedings of the twenty-first international conference on Machine learning. ACM

  5. Lafon S, Lee AB (2006) Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Trans Pattern Anal Mach Intell 28(9):1393–1403

  6. Zhang Z, Zha H (2004) Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J Sci Comput 26(1):313–338

  7. He X, Niyogi P (2003) Locality preserving projections. Adv Neural Inf Process Syst 16(1):186–197

  8. Chen J, Ma Z, Liu Y (2013) Local coordinates alignment with global preservation for dimensionality reduction. IEEE Trans Neural Netw Learn Syst 24(1):106–117

  9. Liu X et al (2014) Global and local structure preservation for feature selection. IEEE Trans Neural Netw Learn Syst 25(6):1083–1095

  10. Gretton A et al (2005) Measuring statistical dependence with Hilbert–Schmidt norms. In: International conference on algorithmic learning theory. Springer, Berlin

  11. Yan K, Kou L, Zhang D (2017) Learning domain-invariant subspace using domain features and independence maximization. IEEE Trans Cybern 48:288–299

  12. Damodaran BB, Courty N, Lefèvre S (2017) Sparse Hilbert Schmidt independence criterion and surrogate-kernel-based feature selection for hyperspectral image classification. IEEE Trans Geosci Remote Sens 55(4):2385–2398

  13. Gangeh MJ, Zarkoob H, Ghodsi A (2017) Fast and scalable feature selection for gene expression data using Hilbert–Schmidt independence criterion. IEEE ACM Trans Comput Biol Bioinform 14(1):167–181

  14. Xiao M, Guo Y (2015) Feature space independent semi-supervised domain adaptation via kernel matching. IEEE Trans Pattern Anal Mach Intell 37(1):54–66

  15. Zhong W et al (2010) Incorporating the loss function into discriminative clustering of structured outputs. IEEE Trans Neural Netw 21(10):1564–1575

  16. Boothby WM (2007) An introduction to differentiable manifolds and Riemannian geometry. Elsevier (Singapore) Pte Ltd., Singapore

  17. Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326

  18. Donoho DL, Grimes C (2003) Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proc Natl Acad Sci 100(10):5591–5596

  19. Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14(6):585–591

  20. He X, Yan S, Hu Y, Niyogi P, Zhang H (2005) Face recognition using Laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340

  21. Pang Y, Zhang L, Liu Z, Yu N, Li H (2005) Neighborhood preserving projections (NPP): a novel linear dimension reduction method. Proc ICIC Pattern Anal Mach Intell 1:117–125

  22. Cai D, He X, Han J, Zhang H (2006) Orthogonal Laplacianfaces for face recognition. IEEE Trans Image Process 15(11):3608–3614

  23. Kokiopoulou E, Saad Y (2007) Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique. IEEE Trans Pattern Anal Mach Intell 29(12):2143–2156

  24. Yan S, Xu D, Zhang B, Zhang H, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40–51

  25. Saul LK, Roweis ST (2003) Think globally, fit locally: unsupervised learning of low dimensional manifolds. J Mach Learn Res 4(1):119–155

  26. Qiao H, Zhang P, Wang D, Zhang B (2013) An explicit nonlinear mapping for manifold learning. IEEE Trans Cybern 43(1):51–63

  27. Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7(1):2399–2434

  28. Jost J (2008) Riemannian geometry and geometric analysis. Springer, Berlin

  29. Spivak M (1981) A comprehensive introduction to differential geometry, vol 4. Publish or Perish

  30. Kreyszig E (1981) Introductory functional analysis with applications. Wiley, New York

  31. Mika S et al (1999) Fisher discriminant analysis with kernels. In: Neural networks for signal processing IX

  32. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  33. Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  34. Gohberg I, Goldberg S, Kaashoek MA (1990) Hilbert–Schmidt operators. In: Classes of linear operators, vol 1. Birkhäuser, Basel, pp 138–147

  35. Xiang S et al (2011) Regression reformulations of LLE and LTSA with locally linear transformation. IEEE Trans Syst Man Cybern Part B (Cybern) 41(5):1250–1262

  36. Martin Sagayam K, Jude Hemanth D (2018) ABC algorithm based optimization of 1-D hidden Markov model for hand gesture recognition application. Comput Ind 99:313–323

  37. Sagayam KM, Hemanth DJ, Ramprasad YN, Menon R (2018) Optimization of hand motion recognition system based on 2D HMM approach using ABC algorithm. In: Hybrid intelligent techniques for pattern analysis and understanding. Chapman and Hall, New York

  38. Sagayam KM, Hemanth DJ (2018) Comparative analysis of 1-D HMM and 2-D HMM for hand motion recognition applications. In: Progress in intelligent computing techniques: theory, practice, and applications, advances in intelligent systems and computing. Springer, p 518

  39. Gangeh MJ, Ghodsi A, Kamel MS (2013) Kernelized supervised dictionary learning. IEEE Trans Signal Process 61(19):4753–4767

  40. Gangeh MJ, Fewzee P, Ghodsi A, Kamel MS, Karray F (2014) Multiview supervised dictionary learning in speech emotion recognition. IEEE ACM Trans Audio Speech Lang Process 22(6):1056–1068

  41. Barshan E et al (2011) Supervised PCA visualization classification and regression on subspaces and submanifolds. Pattern Recognit 44:1357–1371

Acknowledgements

We would like to express our sincere appreciation to the anonymous reviewers for their insightful comments, which have greatly aided us in improving the quality of the paper.

Author information

Corresponding author

Correspondence to Xinghua Zheng.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Reproducing kernel Hilbert spaces (RKHS)

HSIC is based on RKHS. Let \(L^{2}\left( \varOmega \right) =\left\{ f\left| \; f:\varOmega \rightarrow R,\ \int \limits _{\varOmega }{{{\left| f\left( x \right) \right| }^{2}}\,{\mathrm{d}}x}<+\infty \right. \right\}\) be the space of square integrable functions on \(\varOmega\). An inner product \(\left\langle \bullet ,\bullet \right\rangle\) can be defined on \(L^{2}\left( \varOmega \right)\) [30]:

$$\begin{aligned} \left\langle f,g \right\rangle =\int \limits _{\varOmega }{f\left( x \right) g\left( x \right) {\mathrm{d}}x} \end{aligned}$$
(52)

It can be proven that \(H=\left( L^{2}\left( \varOmega \right) ,\left\langle \bullet ,\bullet \right\rangle \right)\) is a complete inner product space, i.e., a Hilbert space.
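As a small numerical illustration of the inner product (52) (an example built on this appendix's definitions, not taken from the paper): with \(\varOmega =\left[ 0,\pi \right]\), \(f\left( x \right) =x\) and \(g\left( x \right) =\sin x\), we have \(\left\langle f,g \right\rangle =\int _{0}^{\pi }{x\sin x\,{\mathrm{d}}x}=\pi\), which a quadrature routine confirms:

```python
# Minimal check of the L^2 inner product in Eq. (52), assuming Omega = [0, pi],
# f(x) = x and g(x) = sin(x); the exact value of <f, g> is pi.
import numpy as np
from scipy.integrate import quad

inner_fg, _ = quad(lambda x: x * np.sin(x), 0.0, np.pi)
print(inner_fg, np.pi)   # both print approximately 3.141592653589793
```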

Definition

[30] Let \(H=\left( L^{2}\left( \varOmega \right) ,\left\langle \bullet ,\bullet \right\rangle \right)\). If there is a function \(k:\varOmega \times \varOmega \rightarrow R\) such that

  • For all \(x\in \varOmega\), \({{k}_{x}}=k\left( \bullet ,x \right) \in H\);

  • For all \(f\in H\), \(f\left( x \right) =\left\langle f,k\left( \bullet ,x \right) \right\rangle\)

then H is called a reproducing kernel Hilbert space (RKHS), and k is called the reproducing kernel of H.

The reproducing kernel k can be used to define a map \(\varphi :\varOmega \rightarrow H\) such that, for all \(x\in \varOmega\),

$$\begin{aligned} \varphi \left( x \right) =k\left( \bullet ,x \right) \in H \end{aligned}$$
(53)

It can be easily proven that

$$\begin{aligned} \left\langle \varphi \left( x \right) ,\varphi \left( y \right) \right\rangle =\left\langle {{k}_{x}},k\left( \bullet ,y \right) \right\rangle ={{k}_{x}}\left( y \right) =k\left( y,x \right) =k\left( x,y \right) \end{aligned}$$
(54)

The above equation is used in many kernel methods of machine learning, such as kPCA [3], kLDA [31], and kSVM [32].
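As a concrete, purely illustrative check of Eq. (54): for the homogeneous polynomial kernel \(k\left( x,y \right) ={{\left( {{x}^{T}}y \right) }^{2}}\) on \({{R}^{2}}\), an explicit finite-dimensional feature map \(\varphi \left( x \right) =\left( x_{1}^{2},x_{2}^{2},\sqrt{2}{{x}_{1}}{{x}_{2}} \right)\) is known, and the feature-space inner product reproduces the kernel value (the kernel choice and sample points below are assumptions of this example, not part of the paper):

```python
# Check of Eq. (54), <phi(x), phi(y)> = k(x, y), for the polynomial kernel
# k(x, y) = (x . y)^2 on R^2 with explicit feature map
# phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2). The sample points are arbitrary.
import numpy as np

def k_poly2(x, y):
    return float(np.dot(x, y)) ** 2

def phi(x):
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2.0) * x[0] * x[1]])

x = np.array([1.5, -0.7])
y = np.array([0.3, 2.0])
print(np.dot(phi(x), phi(y)))   # 0.9025: inner product in feature space
print(k_poly2(x, y))            # 0.9025: direct kernel evaluation
```

In the RKHS picture of this appendix, \(\varphi \left( x \right) =k\left( \bullet ,x \right)\) lives in the function space H; the three-dimensional feature map above is simply an equivalent representation for this particular kernel.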

Furthermore, if X is a random variable on \(\varOmega\), then \(\varphi \left( X \right)\) is a random process, and its mean function is defined as

$$\begin{aligned} \mu _{X}\left( u \right)&= E_{X}\left[ \varphi \left( X \right) \left( u \right) \right] = E_{X}\left[ k\left( u,X \right) \right] \\&= \int \limits _{\varOmega }{k\left( u,x \right) p_{X}\left( x \right) {\mathrm{d}}x} \end{aligned}$$
(55)

Then, for all \(f\in H\),

$$\begin{aligned} \left\langle \mu _{X},f \right\rangle&= \int \limits _{\varOmega }{\mu _{X}\left( u \right) f\left( u \right) {\mathrm{d}}u} \\&=\int \limits _{\varOmega }{\left( \int \limits _{\varOmega }{k\left( u,x \right) p_{X}\left( x \right) {\mathrm{d}}x} \right) f\left( u \right) {\mathrm{d}}u} \\&=\int \limits _{\varOmega }{\left( \int \limits _{\varOmega }{k\left( u,x \right) f\left( u \right) {\mathrm{d}}u} \right) p_{X}\left( x \right) {\mathrm{d}}x} \\&=\int \limits _{\varOmega }{\left\langle f,k\left( \bullet ,x \right) \right\rangle p_{X}\left( x \right) {\mathrm{d}}x} \\&=\int \limits _{\varOmega }{f\left( x \right) p_{X}\left( x \right) {\mathrm{d}}x}=E_{X}\left[ f\left( X \right) \right] \end{aligned}$$
(56)
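Eq. (56) can be illustrated by a Monte Carlo check (the one-dimensional Gaussian kernel, the Gaussian law of X, and the choice \(f=k\left( \bullet ,z \right)\) below are assumptions of this example): the empirical mean embedding gives \(\left\langle {{\hat{\mu }}_{X}},f \right\rangle =\frac{1}{n}\sum \nolimits _{i}{k\left( {{x}_{i}},z \right) }\), which should agree with \(E_{X}\left[ f\left( X \right) \right]\), available here in closed form.

```python
# Monte Carlo check of Eq. (56), <mu_X, f> = E_X[f(X)], with f(.) = k(., z) for a
# Gaussian kernel and X ~ N(0, s_x^2); sigma, s_x and z are arbitrary choices.
import numpy as np

sigma, s_x, z = 0.8, 1.0, 0.5
rng = np.random.default_rng(1)
x = rng.normal(0.0, s_x, size=100_000)

def k(u, v):
    return np.exp(-(u - v) ** 2 / (2.0 * sigma ** 2))

# <mu_hat_X, k(., z)> = (1/n) sum_i <k(., x_i), k(., z)> = (1/n) sum_i k(x_i, z)
embedding_side = np.mean(k(x, z))

# Closed form of E_X[k(X, z)] when X is Gaussian and k is a Gaussian kernel
expectation_side = sigma / np.sqrt(sigma ** 2 + s_x ** 2) \
    * np.exp(-z ** 2 / (2.0 * (sigma ** 2 + s_x ** 2)))
print(embedding_side, expectation_side)   # agree up to Monte Carlo error
```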

It can be proven that an RKHS can be generated from a kernel function. The definition of a kernel function is as follows:

Definition

[33] Let \(k:\varOmega \times \varOmega \rightarrow R\). If k satisfies the following conditions:

  • Symmetric: for all \(x,y\in \varOmega\), \(k\left( x,y \right) =k\left( y,x \right)\)

  • Square integrable: for all \(x\in \varOmega\), \({{k}_{x}}=k\left( \bullet ,x \right)\) is square integrable

  • Positive definite: for all \({{x}_{1}},\ldots ,{{x}_{N}}\in \varOmega\), the Gram matrix \(\left[ \begin{array}{lll} k\left( {{x}_{1}},{{x}_{1}} \right) &{} \ldots &{} k\left( {{x}_{1}},{{x}_{N}} \right) \\ \vdots &{} \ddots &{} \vdots \\ k\left( {{x}_{N}},{{x}_{1}} \right) &{} \ldots &{} k\left( {{x}_{N}},{{x}_{N}} \right) \\ \end{array} \right]\) is positive definite,

then k is called a kernel function.
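These conditions can be checked numerically on a finite sample. A minimal sketch (the Gaussian kernel and the random sample are assumptions of this example, not part of the definition):

```python
# Numerical check of the kernel-function conditions for the Gaussian kernel
# k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) on a random sample: the Gram matrix
# should be symmetric and have (numerically) non-negative eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
sigma = 1.2

sq = np.sum(X ** 2, axis=1)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)  # squared distances
K = np.exp(-D2 / (2.0 * sigma ** 2))                              # Gram matrix

print(np.allclose(K, K.T))                   # symmetry: True
print(np.linalg.eigvalsh(K).min() > -1e-10)  # positive definite up to round-off: True
```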

Remark

Kernel functions and reproducing kernels are not the same concept: a kernel function is defined on its own, while a reproducing kernel is defined with respect to an RKHS.

Theorem

[30] A kernel function k can be used to generate a unique RKHS \({{H}_{k}}\) such that k becomes the reproducing kernel of \({{H}_{k}}\).

Based on this theorem, once a kernel function is chosen, an RKHS is determined as well.

About this article

Cite this article

Zheng, X., Ma, Z. & Li, L. Local tangent space alignment based on Hilbert–Schmidt independence criterion regularization. Pattern Anal Applic 23, 855–868 (2020). https://doi.org/10.1007/s10044-019-00810-6
