Locality adaptive preserving projections for linear dimensionality reduction

https://doi.org/10.1016/j.eswa.2020.113352

Highlights

  • Seeking the local structure in the original feature space is shown to be error-prone.

  • We propose a locality adaptive projection approach for neighborhood preserving.

  • Experimental results demonstrate the feasibility of the proposed method.

Abstract

Dimensionality reduction techniques aim to transform high-dimensional data into a meaningful reduced representation and have consistently played a fundamental role in the study of intrinsic dimensionality estimation and the design of intelligent expert systems for real-world applications. From the perspective of manifold learning, locality preserving projections is a classical and commonly used dimensionality reduction method that learns a low-dimensional embedding under the constraint of preserving the local geometry of the data. However, since it determines the neighborhood relationships in the original feature space, which probably contains noisy and irrelevant features, the derived similarities between neighbors are unreliable and the corresponding local data manifold tends to be error-prone, which inevitably degrades subsequent data analyses. Hence, accurately identifying the true neighbor relationships of each sample remains crucial to improving robustness. In this work, we propose a novel approach, termed locality adaptive preserving projections (LAPP), that adaptively determines the neighbors and their relationships in the optimal subspace rather than in the original space. Specifically, in the absence of prior knowledge of the local properties of the underlying manifold, LAPP adopts a coarse-to-fine strategy that iteratively updates the projected low-dimensional subspace and refines the identification of the local structure of the data. Moreover, an iterative algorithm with fast convergence is used to solve for the transformation matrix, which provides an explicit out-of-sample extension. LAPP is also easy to implement, and its key idea can potentially be extended to other methods for neighbor-finding and similarity measurement. To evaluate the performance of LAPP, we conduct comparative experiments on numerous synthetic and real-world datasets. Experimental results show that seeking the local structure in the original feature space misleads both the selection of neighbors and the calculation of similarity, and that the proposed method helps alleviate the negative effect of noisy and irrelevant features, demonstrating its effectiveness. This study may also encourage related work to consider the problem of optimizing neighborhood relationships.

Introduction

For a variety of research fields and real-world applications that range from face recognition (He, Yan, Hu, Niyogi & Zhang, 2005) and smoke detection (Yuan, Xia, Shi, Li & Li, 2017) to activity recognition (Wang, Chen, Yang, Zhao & Chang, 2016) and finance management (Tayalı & Tolun, 2018; Zhong & Enke, 2017), we are often confronted with high-dimensional data and required to develop powerful analysis methods for the discovery of knowledge and the design of decision support systems, especially in the era of big data, where we face massive amounts of data characterized by complexity, variety, and high dimensionality. Consequently, prediction and evaluation models directly trained on such data not only suffer from the curse of dimensionality but also incur heavy computational loads. Even worse, if the original feature space fails to reflect the intrinsic structure of the data, performance degrades and the confidence of a decision system is lowered to a large extent (Qiao, Chen & Tan, 2010). For example, in a face recognition system, a w×h face image is often flattened into a (w·h)-dimensional vector for appearance-based techniques, which is too large for robust face recognition (Bhowmik, Saha, Singha, Bhattacharjee & Dutta, 2019). In the task of building an intelligent expert system for daily stock market analysis, researchers usually collect a wide range of financial and economic features to maximize the stock market return. However, some of these features are irrelevant to the task and even redundant to each other (Zhong & Enke, 2017). Undoubtedly, this poses a serious challenge to the exploration of the intrinsic dimensionality of the data, the efficiency of many machine learning models, and the generalization ability of a system in real-world scenarios. Accordingly, one common way to mitigate this problem is to utilize an effective and efficient dimensionality reduction method (Bhowmik et al., 2019; van der Maaten, Postma & van den Herik, 2009).

As an important preprocessing step in data analysis, dimensionality reduction techniques work by transforming high-dimensional data into a meaningful low-dimensional representation in a linear or non-linear way, and they have consistently played a fundamental role in revealing the intrinsic structure of the data and facilitating subsequent tasks (Zhao, Wang & Nie, 2018; Zhong & Enke, 2017). In particular, dimensionality reduction contributes to classification, regression, clustering, visualization, and data compression in a variety of applications such as face recognition, information retrieval, and disease diagnosis (Becht et al., 2019; van der Maaten & Hinton, 2008). For example, principal component analysis seeks a set of uncorrelated variables by discarding redundant information, which helps reduce noise and improve the performance of a classifier. The key assumption behind dimensionality reduction is that the original feature space contains irrelevant features and that some features are redundant to each other, so a smaller set of new features can represent the original ones (Tenenbaum, De Silva & Langford, 2000). Therefore, the task of dimensionality reduction is to find a reduced representation with the intrinsic dimensionality of the data by deriving an appropriate linear or non-linear transformation function under carefully devised constraints (van der Maaten et al., 2009; Zhao et al., 2018).

According to the requirement for the availability of data labels, existing dimensionality reduction techniques can be broadly categorized into three groups: supervised, unsupervised, and semi-supervised methods. Principal component analysis (PCA) is the most widely used unsupervised dimensionality reduction method; it seeks a subspace that maximizes the variance of the projected data (Martínez & Kak, 2001). In contrast to PCA, linear discriminant analysis (LDA) utilizes label information and seeks a transformation matrix that simultaneously maximizes the between-class scatter and minimizes the within-class scatter, pulling samples with the same label close together and pushing samples with different labels far apart (Martínez & Kak, 2001). Though simple and intuitive, PCA and LDA are widely used in data preprocessing and perform well in a wealth of applications such as face recognition, seismic series analysis, visualization, and clustering (Belhumeur, Hespanha & Kriegman, 1997). However, both PCA and LDA utilize only the global structure of the data and ignore its local properties, which limits their performance in complex cases where local structure matters (Belhumeur et al., 1997; Martínez & Kak, 2001).
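As a concrete illustration of the unsupervised/supervised distinction, the following sketch contrasts PCA with LDA on scikit-learn's bundled digits data; the dataset and parameter choices here are ours for illustration, not the paper's.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# 64-dimensional handwritten digit images with labels 0-9
X, y = load_digits(return_X_y=True)

# PCA: unsupervised; keeps the directions of maximal variance
X_pca = PCA(n_components=2).fit_transform(X)

# LDA: supervised; separates the labeled classes
# (at most n_classes - 1 = 9 components are available)
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)  # (1797, 2) (1797, 2)
```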

In contrast, another line of research explores the local properties of the data. From the perspective of manifold learning, dimensionality reduction essentially aims to find the low-dimensional manifold embedded in a high-dimensional space, with an embedding that keeps the geometric characteristics of the data as much as possible (Garcia-Vega & Castellanos-Dominguez, 2019; Tenenbaum et al., 2000). Accordingly, researchers have investigated manifold learning and its application to dimensionality reduction. Isometric mapping (ISOMAP) (Tenenbaum et al., 2000), locally linear embedding (LLE) (Roweis & Saul, 2000), and Laplacian eigenmaps (LE) (He & Niyogi, 2004) are three representative local methods that find a lower-dimensional embedding of data lying on or around a high-dimensional non-linear manifold. They have achieved satisfactory performance in multiple application domains (Krstanović et al., 2016; van der Maaten et al., 2009); however, they do not provide an explicit mapping between the original data and the reduced representation. That is, researchers generally have to recompute the projection vectors to cope with out-of-sample extension, which greatly limits their flexibility and leads to high time costs when processing streaming data. To allow for the efficient embedding of new data points, researchers have investigated linearized versions of several non-linear dimensionality reduction methods. For example, He, Cai, Yan and Zhang (2005) proposed neighborhood preserving embedding (NPE) to linearly approximate LLE, and locality preserving projections (LPP) is a linear approximation to LE (He & Niyogi, 2004). Specifically, LPP is a commonly used and well-performing approach that obtains a linear transformation matrix while preserving the local neighborhood relationships of the data; unlike most existing manifold learning methods, it returns an explicit transformation matrix that serves the out-of-sample extension. Its two key components are the construction of the neighbor graph and the measurement of similarity between neighbors, both of which largely determine its performance. In addition, several variants of LPP have been proposed and experimentally validated, such as discriminant locality preserving projections (DLPP), which makes use of label information (Yu, Teng & Liu, 2006), and null space discriminant locality preserving projections (NDLPP), which targets the small sample size problem of DLPP (Yang, Gong, Gu, Li & Liang, 2008; Yu et al., 2006).
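For reference, LPP's standard formulation (He & Niyogi, 2004) makes these two components explicit: a neighbor graph with heat-kernel weights, and a generalized eigenproblem whose solution preserves those weights under a linear map:

```latex
% LPP objective: neighbors that are close in the input space
% should stay close after the projection a.
\min_{\mathbf{a}} \sum_{i,j} \left(\mathbf{a}^{\top}\mathbf{x}_i - \mathbf{a}^{\top}\mathbf{x}_j\right)^{2} W_{ij},
\qquad
W_{ij} =
\begin{cases}
\exp\!\left(-\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2} / t\right) & \text{if } \mathbf{x}_i,\mathbf{x}_j \text{ are neighbors},\\
0 & \text{otherwise.}
\end{cases}

% With D_{ii} = \sum_j W_{ij} and L = D - W, the minimizer solves
\mathbf{X} L \mathbf{X}^{\top}\mathbf{a} = \lambda\, \mathbf{X} D \mathbf{X}^{\top}\mathbf{a},
\quad \text{subject to } \mathbf{a}^{\top}\mathbf{X} D \mathbf{X}^{\top}\mathbf{a} = 1.
```

Note that W is computed from distances in the original feature space; this is precisely the step the present paper argues is error-prone under noisy and irrelevant features.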

Although LPP and its variants have been successfully applied to real-world applications, LPP risks choosing false nearest neighbors and miscalculating the similarity between neighbors, so the derived local manifold tends to be error-prone, which inevitably degrades subsequent data analyses. This is mainly because LPP measures the similarity between neighbors in the original feature space, where noisy and irrelevant features exist (Wang et al., 2016; Zhao et al., 2018). Clearly, neighbor relationships obtained in the optimal subspace are more reliable than those obtained in the original feature space and better reflect the truth. Therefore, maximally mitigating the effect of noisy factors and accurately identifying the true neighbor relationships of each sample remains crucial. However, we have no prior knowledge of the optimal subspace, which poses a challenge to determining the true similarity between neighbors and to improving the robustness of manifold learning-based methods. Accordingly, in this study, we propose a novel approach, termed locality adaptive preserving projections (LAPP), that adaptively determines the neighborhood relationships in the optimal subspace rather than in the original feature space. Specifically, in the absence of prior knowledge of the local properties of the underlying manifold, LAPP adopts a coarse-to-fine strategy to handle this chicken-and-egg situation. Moreover, an iterative algorithm with fast convergence is utilized to solve the constrained optimization problem, yielding an explicit out-of-sample extension. This enables us to better reveal the underlying manifold and obtain correspondingly robust embeddings. The main contributions of this study are as follows. First, we analyze manifold learning-based dimensionality reduction techniques, especially the commonly used LPP, and point out that seeking the local structure in the original feature space is error-prone in terms of neighbor-finding and similarity measurement; this may motivate researchers to pay special attention to the same problem in other dimensionality reduction methods. Second, we propose a locality adaptive preserving projections approach to optimizing the measurement of neighbor relationships. The proposed method iteratively updates the projected low-dimensional subspace and refines the identification of the local structure of the data, and its key idea can potentially be extended to other similar methods. Third, we implement and evaluate the proposed approach on numerous synthetic and real-world datasets. Extensive experimental results show that constructing the neighbor graph in the original feature space leads to lower performance, which demonstrates the effectiveness of the proposed method.
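Since the full derivation is not reproduced in this preview, the following is a minimal sketch of the coarse-to-fine idea as we read it: start from an initial projection, re-estimate the neighbor graph in the current subspace rather than the original space, re-solve the LPP-style eigenproblem, and repeat. The initialization, regularization, and fixed iteration count are our assumptions, not the authors' exact algorithm.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def lapp_sketch(X, dim, k=5, t=1.0, n_iter=10):
    """Coarse-to-fine locality preserving projection: re-estimate the
    neighbor graph in the current subspace instead of the original
    feature space. A sketch of the idea, not the authors' algorithm."""
    n, d = X.shape
    A = np.eye(d)[:, :dim]              # coarse start (PCA also plausible)
    for _ in range(n_iter):
        Z = X @ A                       # current low-dimensional embedding
        # k-NN graph and heat-kernel weights computed in the subspace
        G = kneighbors_graph(Z, k, mode='distance').toarray()
        W = np.where(G > 0, np.exp(-G ** 2 / t), 0.0)
        W = np.maximum(W, W.T)          # symmetrize the graph
        D = np.diag(W.sum(axis=1))
        L = D - W
        # LPP-style generalized eigenproblem (rows of X are samples):
        # X^T L X a = lambda * X^T D X a; keep the smallest eigenvalues
        M1 = X.T @ L @ X
        M2 = X.T @ D @ X + 1e-6 * np.eye(d)  # regularize for stability
        _, vecs = eigh(M1, M2)
        A = vecs[:, :dim]
    return A                            # explicit map for new samples: x @ A
```

Because the projection A is explicit, out-of-sample points are embedded with a single matrix product, which is the flexibility advantage over LLE/LE noted above.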

The remainder of this study is organized as follows. Section 2 briefly reviews related work on dimensionality reduction techniques by introducing four commonly used methods. We detail the proposed locality adaptive preserving projections method and its motivation in Section 3. Section 4 gives the experimental setup and results on both synthetic and real-world datasets and presents corresponding analyses. The last section concludes this study with a brief summary and discusses insightful future research directions.

Section snippets

Related work

Over the past few decades, a large number of dimensionality reduction methods have been proposed and used in diverse areas (e.g., decision support systems, face recognition, and data visualization), and we can categorize them from different perspectives. According to whether the mapping function between the high-dimensional space and the reduced feature space is linear, we can group dimensionality reduction techniques into linear methods (e.g., PCA and LDA) and non-linear methods (e.g., LLE,

Locality adaptive preserving projections

As discussed above, the selection of nearest neighbors and the measurement of neighborhood relationships largely determine the performance of locality preserving projections. In particular, how to reduce the effect of unimportant factors and accurately seek the neighbors in the optimal subspace remains critical. Ideally, if we had prior knowledge of the noisy and irrelevant features of the data, we could obtain the true pairwise distances between neighbors and derive the optimal feature

Experimental results and analysis

To evaluate the effectiveness of the proposed method, we conduct extensive experiments on two synthetic Swiss roll datasets, three face recognition benchmark datasets, namely the Yale face database (YALE), the Olivetti Research Laboratory database (ORL), and the extended Yale Face Database B (E_YALE), as well as the hand-written digit recognition dataset MNIST. We compare LAPP with four other well-performing dimensionality reduction methods, including two global methods (PCA and LDA) and two local
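For readers who want to reproduce a comparable synthetic setting, a noisy Swiss roll can be generated as below; the sample size and noise level are illustrative guesses, since the preview does not specify the paper's exact construction.

```python
from sklearn.datasets import make_swiss_roll

# Illustrative parameters; the paper's exact settings are not shown here.
X, t = make_swiss_roll(n_samples=1500, noise=0.5, random_state=0)
print(X.shape)  # (1500, 3): a 2-D manifold embedded in 3-D space
```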

Conclusions

Dimensionality reduction techniques have consistently played an important role in data analysis and in the design of intelligent expert systems such as face recognition and disease diagnosis, and they significantly facilitate, among other tasks, the classification, clustering, visualization, and compression of high-dimensional data. Due to the complex non-linear relations inherent in data, however, dimensionality reduction methods that preserve the local

CRediT authorship contribution statement

Aiguo Wang: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Shenghui Zhao: Validation, Formal analysis, Writing - review & editing. Jinjun Liu: Writing - review & editing. Jing Yang: Formal analysis, Writing - review & editing. Li Liu: Writing - original draft, Writing - review & editing. Guilin Chen: Conceptualization, Methodology, Supervision, Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (No. 61902068), the Key Research and Development Project of Anhui Province (No. KJ2019ZD44), the Major Special Projects of Anhui Province (No. 201903A06020026), and the Anhui Provincial Natural Science Foundation (No. 1908085MF211).

References (35)

  • H. Zhao et al., Adaptive neighborhood MinMax projections, Neurocomputing (2018)

  • X. Zhong et al., Forecasting daily stock market return using dimensionality reduction, Expert Systems with Applications (2017)

  • E. Becht et al., Dimensionality reduction for visualizing single-cell data using UMAP, Nature Biotechnology (2019)

  • P. Belhumeur et al., Eigenfaces vs. fisherfaces: Recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence (1997)

  • D. Cai et al., Orthogonal Laplacianfaces for face recognition, IEEE Transactions on Image Processing (2006)

  • X. He et al., Neighborhood preserving embedding

  • X. He et al., Locality preserving projections