An improved local tangent space alignment method for manifold learning
Research highlights
- The proposed method is an improvement of the LTSA method.
- A new method for local tangent space approximation is proposed.
- It is more accurate than the widely used PCA approximation.
- The proposed method can deal with sparse or non-uniformly distributed data.
Introduction
In many real-world applications such as data visualization and visual tracking, we are often faced with high-dimensional data samples which have only a few intrinsic degrees of freedom. The set of such data samples can be modeled as a data manifold, and algorithms which aim to reduce dimensionality by revealing the manifold structure can be cast into the framework of manifold learning. Traditional algorithms such as principal component analysis (PCA) (Jolliffe, 1999) and multidimensional scaling (MDS) (Cox and Cox, 2001) are successful only when the data manifold is linear. Recently, progress has been made in developing efficient algorithms that can learn the low-dimensional structure of nonlinear data manifolds. These methods include isometric feature mapping (ISOMAP) (Tenenbaum et al., 2000), locally linear embedding (LLE) (Roweis and Saul, 2000), Laplacian eigenmap (LE) (Belkin and Niyogi, 2003), Hessian LLE (HLLE) (Donoho and Grimes, 2003), local tangent space alignment (LTSA) (Zhang and Zha, 2004) and many others.
Among nonlinear manifold learning algorithms, LTSA has received wide attention since it is simple in geometric intuition and straightforward to implement. For data samples drawn from an m-dimensional manifold, LTSA first implements PCA on each neighborhood of data samples to get an m-dimensional subspace which approximates the local tangent space. LTSA then computes the local tangent coordinates of data samples and finally aligns them into a global coordinate system.
The performance of LTSA highly depends on the quality of the local tangent spaces approximated by PCA. However, the PCA approximation is accurate only when the following two assumptions hold:
- (A1) data samples are uniformly distributed;
- (A2) data samples in each local neighborhood of the manifold lie in or close to a linear subspace.
When data samples are sparse or non-uniformly distributed, or when the data manifold has large curvatures, these assumptions cannot be met and the PCA approximation may be a poor estimate. This can make LTSA fail to reveal the manifold structure.
To overcome the drawbacks in using PCA for tangent space approximation, in this paper we propose a new method which can get accurate approximations to the local tangent spaces even when data samples are sparse or non-uniformly distributed or even when the data manifold has large curvatures. Compared with PCA, our method has the following two features.
- First, each data sample itself, rather than the mean of its neighborhood samples (as in PCA), is used as the origin of its approximated tangent space. The approximated local tangent space is then not biased when data samples are non-uniformly distributed or sparse.
- Second, the bases of the tangent space are obtained by minimizing a weighted sum of the projection distances rather than the plain sum. The effect of curvature can thus be accounted for when the data manifold has large curvatures.
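The two modifications above can be sketched in code. This is our illustrative reading, not the authors' exact formulation; in particular, the Gaussian weighting scheme below is an assumption chosen to down-weight far-away neighbors, where curvature matters most:

```python
import numpy as np

def weighted_tangent(X, i, k, m, sigma=None):
    """Illustrative sketch of the two modifications: the tangent origin
    is the sample x_i itself (not the neighbourhood mean), and the
    projection errors are weighted so distant neighbours count less."""
    d2 = np.sum((X - X[i]) ** 2, axis=1)
    nbr = np.argsort(d2)[:k + 1]            # k neighbours plus x_i itself
    Xi = X[nbr] - X[i]                      # origin at x_i, not the mean
    if sigma is None:                       # illustrative bandwidth choice
        sigma = np.sqrt(d2[nbr].max()) + 1e-12
    w = np.exp(-d2[nbr] / (2 * sigma ** 2))  # assumed Gaussian weights
    # minimise sum_j w_j ||Xi_j - Q Q^T Xi_j||^2  ==>  weighted SVD
    _, _, Vt = np.linalg.svd(Xi * np.sqrt(w)[:, None], full_matrices=False)
    Q = Vt[:m].T                            # D x m orthonormal tangent basis
    theta = Xi @ Q                          # local tangent coordinates
    return Q, theta
```

Because the origin is x_i itself, the sample's own tangent coordinate is exactly zero, regardless of how its neighbors are distributed.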
Based on this new tangent space approximation, we propose an improved local tangent space alignment (ILTSA) algorithm which can reveal the underlying manifold structure by aligning the local tangent coordinates. Numerical experiments show that ILTSA can get faithful learning results even when data samples are sparse or non-uniformly distributed or even when the data manifold has large curvatures.
The remaining part of the paper is organized as follows. The PCA-based tangent space approximation as well as its limitations are described in Section 2. The ILTSA algorithm is presented in Section 3. A theoretical analysis of ILTSA is given in Section 4. Experimental results on both synthetic and real-world data sets are illustrated in Section 5. Some concluding remarks are stated in Section 6.
Section snippets
PCA-based tangent space approximation and its limitations
We first describe the basic steps of the PCA-based tangent space approximation (PTSA) method and then illustrate its limitations with a synthetic example. Given a data manifold, the basic idea of PTSA is to find a linear subspace within each local neighborhood of the data manifold such that 1) the origin of the subspace is at the mean of the data samples of the neighborhood and 2) the sum of the projection distances between the data samples of the neighborhood and their orthogonal projections onto the subspace is minimized.
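The PTSA procedure just described can be sketched as follows (a minimal NumPy illustration; the variable names are ours, not the paper's):

```python
import numpy as np

def ptsa_tangent(X, i, k, m):
    """PCA-based tangent space approximation at sample i: origin at the
    neighbourhood mean, basis from the top-m right singular vectors of
    the centred neighbourhood."""
    d2 = np.sum((X - X[i]) ** 2, axis=1)
    nbr = np.argsort(d2)[:k]              # k nearest samples (incl. x_i)
    Xi = X[nbr]
    mean = Xi.mean(axis=0)                # PCA origin: neighbourhood mean
    _, _, Vt = np.linalg.svd(Xi - mean, full_matrices=False)
    Q = Vt[:m].T                          # D x m orthonormal tangent basis
    theta = (Xi - mean) @ Q               # local tangent coordinates (k x m)
    return mean, Q, theta
```

On a curved patch (e.g. samples from z = x² + y²), the returned origin sits strictly above the surface point, which is precisely the bias that motivates the improved approximation.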
Improved local tangent space alignment algorithm
The improved local tangent space alignment algorithm (ILTSA) consists of two steps: local tangent space approximation and global alignment of local tangent coordinates. In Section 3.1, we propose a new method for local tangent space approximations. The global alignment step is then given in Section 3.2. Finally, the implementation details of ILTSA are stated in Section 3.3.
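The two-step pipeline can be sketched compactly. The alignment below is the standard LTSA alignment-matrix eigenproblem, with plain PCA supplying the local coordinates; ILTSA would substitute its own tangent approximation in `local_coords`:

```python
import numpy as np

def local_coords(Xi, m):
    # Step 1: local tangent coordinates of one neighbourhood (plain PCA here).
    c = Xi - Xi.mean(axis=0)
    _, _, Vt = np.linalg.svd(c, full_matrices=False)
    return c @ Vt[:m].T                       # k x m, zero-mean

def align(X, k, m):
    """Step 2: align the local tangent coordinates into a single global
    coordinate system via the alignment-matrix eigenproblem."""
    n = X.shape[0]
    B = np.zeros((n, n))
    for i in range(n):
        d2 = np.sum((X - X[i]) ** 2, axis=1)
        nbr = np.argsort(d2)[:k]
        theta = local_coords(X[nbr], m)
        U = np.linalg.svd(theta, full_matrices=False)[0][:, :m]
        G = np.hstack([np.ones((k, 1)) / np.sqrt(k), U])
        B[np.ix_(nbr, nbr)] += np.eye(k) - G @ G.T   # local alignment matrix
    _, vecs = np.linalg.eigh(B)
    return vecs[:, 1:m + 1]                   # skip the constant eigenvector
```

On a clean one-dimensional curve such as a helix, the single recovered coordinate tracks the curve's parameter almost linearly.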
Theoretical analysis of ILTSA
In this section, we present a theoretical analysis of ILTSA. We first show why ILTSA can find more accurate tangent space approximations compared with LTSA by analyzing the error between the approximated tangent coordinates and their true values. We then explain why weights are introduced in minimizing the sum of the projecting distances.
Consider the application of ILTSA on an isometric Riemannian manifold M. Suppose M can be globally parameterized as x = F(τ) for τ in an open set Ω in R^m, where F = [F1 …
Experimental results
In this section, we apply the ILTSA algorithm to both synthetic data and high-dimensional image data to test its performance. Since ILTSA can be viewed as an improved version of LTSA, we mainly compare the performance of ILTSA with LTSA and LPCA. In addition, comparison is also made between ILTSA and other popular manifold learning methods on a synthetic data set.
Conclusions and discussions
In this paper, we first proposed a new method to effectively approximate the local tangent spaces of a data manifold. Compared with PCA-based tangent space approximations, our method can provide more accurate approximation to the local tangent basis and coordinates. Then, based on this new approximation method, we proposed an improved LTSA algorithm (called ILTSA algorithm) which can align the local tangent coordinates into a single global coordinate system. Compared with the LTSA algorithm …
Acknowledgements
This work was partly supported by the NNSF of China Grant No. 90820007, the Outstanding Youth Fund of the NNSF of China Grant No. 60725310, the 863 Program of China Grant No. 2007AA04Z228 and the 973 Program of China Grant No. 2007CB311002. The authors thank the referees for their invaluable comments and suggestions which helped improve the paper greatly.
References (16)
- Data dimensionality estimation methods: a survey. Pattern Recognit. (2003)
- Intrinsic dimension estimation of manifolds by incising balls. Pattern Recognit. (2009)
- Asuncion, A., Newman, D., 2007. UCI Machine Learning Repository [<http://www.ics.uci.edu/mlearn/MLRepository.html>]
- Belkin, M., Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. (2003)
- Riemannian Geometry (1992)
- Cox, T.F., Cox, M.A.A. Multidimensional Scaling (2001)
- Donoho, D.L., Grimes, C. Hessian eigenmaps: new locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. USA (2003)
- Jolliffe, I.T. Principal Component Analysis (1999)