Elsevier

Information Sciences

Volume 178, Issue 7, 1 April 2008, Pages 1825-1835

Kernel class-wise locality preserving projection

https://doi.org/10.1016/j.ins.2007.12.001

Abstract

In recent years, the pattern recognition community has paid increasing attention to a new family of feature extraction methods, the manifold learning methods, which project the original data into a lower-dimensional feature space while preserving the local neighborhood structure. Among them, locality preserving projection (LPP) is one of the most promising feature extraction techniques. However, when LPP is applied to classification tasks, it shows some limitations, such as ignoring the label information. In this paper, we propose a novel local-structure-based feature extraction method, called class-wise locality preserving projection (CLPP). CLPP utilizes class information to guide the feature extraction procedure. In CLPP, the local structure of the original data is constructed according to a similarity between data points that takes both the local information and the class information into account. Moreover, a kernel version of CLPP, namely kernel CLPP (KCLPP), is developed by applying the kernel trick to CLPP to improve its performance on nonlinear feature extraction. Experiments on the ORL and YALE face databases are performed to test and evaluate the proposed algorithm.

Introduction

In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information [7]. Most feature extraction methods focus on finding a linear transformation that projects the data from a high-dimensional input space into a lower-dimensional feature space, such that the feature vector in the feature space contains all the necessary discriminative information. Over the past several decades, many dimensionality reduction techniques have been proposed. The most well-known feature extraction methods are arguably principal component analysis (PCA) [1] and linear discriminant analysis (LDA) [2]. PCA seeks a linear transformation matrix that minimizes the mean squared error criterion; the optimal matrix is composed of the eigenvectors of the sample covariance corresponding to the largest eigenvalues (the principal components). The purpose of PCA is to preserve as much of the information, in terms of variance, as possible. Linear discriminant analysis (also called Fisher's linear discriminant) is another popular linear dimensionality reduction method, and in many applications LDA has proven to be much more effective than PCA. In previous work, PCA was generalized to nonlinear curves such as principal curves [6] and extensions such as principal surfaces [3]. Principal curves and principal surfaces are the nonlinear generalizations of principal components and subspaces, respectively. It has turned out that discretized principal curves are essentially equivalent to self-organizing maps (SOM) [13], [16]. SOM is a nonparametric latent variable model with a topological constraint, such as lines, squares, or hexagonal grids, and its mapping is similar to a discrete self-similarity principle for a principal manifold. As a data-driven dimensionality reduction method, SOM can be regarded as an approximation of the principal surface; it was later extended to visualization-induced SOM (ViSOM) [27].
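The PCA projection described above (eigenvectors of the sample covariance with the largest eigenvalues) can be sketched as follows; this is a minimal illustration, not any paper's reference implementation:

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components.

    X: (n_samples, n_features) data matrix.
    Returns the projected data and the projection matrix.
    """
    X_centered = X - X.mean(axis=0)
    # Sample covariance of the features.
    cov = np.cov(X_centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues (principal components).
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]
    return X_centered @ W, W
```

The columns of `W` are orthonormal, so the projection preserves as much variance as any rank-`n_components` linear map.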
ViSOM represents a discrete principal curve or surface; it produces a smooth, graded mesh in the data space and captures the nonlinear manifold of the data [25]. Moreover, other nonlinear manifold algorithms have been proposed, such as locally linear embedding (LLE) [17] and Isomap [22]. LLE approaches dimensionality reduction from a geometric perspective, while Isomap utilizes geodesic distances to represent connected graphs and the relationships among data points. Both LLE and Isomap preserve the neighborhoods and geometric relationships of the data. Isomap and LLE easily map the training data points into the reduced-dimensional space, but locating new test data points is difficult. Locality preserving projection [8], in contrast, easily locates a new data point in the reduced representation space. However, LPP is a linear dimensionality reduction method and lacks sufficient nonlinear discriminant power for linearly non-separable classes. Kernel methods have been widely used to overcome the limitations of linear feature extraction and classification methods. Kernel-based feature extraction and learning methods, such as kernel principal component analysis (KPCA) [19], [11], kernel discriminant analysis (KDA) [14], [26] and the support vector machine (SVM) [10], [24], are widely used in the pattern recognition and machine learning areas [4], [23]. Other research topics related to kernel-based learning have also attracted increasing attention from researchers [12], [15], [18], [20]. LPP has been successfully applied in the pattern recognition and information retrieval areas; for example, an LPP-based feature extraction method, namely Laplacianfaces, was proposed for face recognition [9]. LPP constructs the adjacency graph via nearest neighbor search, and the original data are mapped to the low-dimensional space for feature extraction. LPP performs well in many practical applications, such as audio, video, and text document retrieval.
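The LPP construction just described (a nearest-neighbor adjacency graph followed by a generalized eigenproblem on the graph Laplacian) can be sketched as follows. This is a standard textbook formulation under common choices (heat-kernel weights, symmetrized graph, a small ridge for numerical stability), not the authors' reference implementation:

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components, k=5, t=1.0):
    """Locality preserving projection: a linear map preserving the
    k-nearest-neighbor structure of the data.

    X: (n_samples, n_features). Returns the projection matrix W.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # k-nearest-neighbor adjacency graph with heat-kernel weights.
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]   # skip self at index 0
        S[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    S = np.maximum(S, S.T)                  # symmetrize the graph
    D = np.diag(S.sum(axis=1))              # degree matrix
    L = D - S                               # graph Laplacian
    # Generalized eigenproblem X^T L X w = lambda X^T D X w;
    # the smallest eigenvalues give the locality-preserving directions.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])  # ridge for stability
    _, vecs = eigh(A, B)
    return vecs[:, :n_components]
```

Because the map is an explicit matrix `W`, a new test point `x` is embedded simply as `x @ W`, which is exactly the out-of-sample advantage over Isomap and LLE noted above.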
However, all of these methods are completely unsupervised with respect to the class labels of the data, and have little to do with discriminative features optimal for classification. In this paper, a novel local-structure-based feature extraction method built on the idea of LPP, namely class-wise locality preserving projection (CLPP), is proposed to enhance the class structure of the data. Unlike the unsupervised learning scheme of LPP, CLPP follows a supervised learning scheme, i.e., it uses the class information to model the manifold structure. In CLPP, the local structure of the original data is constructed through a nearest neighbor graph whose construction takes both the local information and the class information into account. Moreover, we extend CLPP to nonlinear feature extraction via the kernel trick, yielding the kernel CLPP (KCLPP) algorithm and making CLPP a robust technique for feature extraction tasks.
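To make the class-guided graph construction concrete, the sketch below builds an adjacency matrix in the spirit of CLPP. The exact similarity definition used in the paper is not reproduced here; this hypothetical variant keeps heat-kernel weights only between nearest neighbors that share a class label:

```python
import numpy as np

def class_wise_adjacency(X, y, k=5, t=1.0):
    """Build a class-guided adjacency matrix (illustrative variant).

    NOTE: the similarity actually defined in the paper may differ;
    here, an edge between k-nearest neighbors is kept only when the
    two points carry the same class label.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d2[i])[1:k + 1]:   # skip self
            if y[i] == y[j]:                   # class labels guide the graph
                S[i, j] = np.exp(-d2[i, j] / t)
    return np.maximum(S, S.T)                  # symmetrize
```

The resulting matrix `S` can then replace the unsupervised adjacency matrix in the LPP eigenproblem, so that only same-class neighborhoods are preserved by the projection.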

The rest of this paper is organized as follows. The detailed formulation and development of KCLPP are described in Section 2. An evaluation of the proposed method on two databases is reported in Section 3. Finally, Section 4 draws a conclusion and opens a prospective for future work.

Section snippets

Kernel class-wise locality preserving projection (KCLPP)

In this section, we first briefly review the LPP algorithm, then present the CLPP algorithm and its kernel version in detail, and finally describe the full procedure of the proposed algorithm for feature extraction and classification.

Experimental results and discussion

In this section, we conduct experiments on the ORL and YALE databases to evaluate the proposed algorithm. First we select the procedure parameters via cross-validation, i.e., k for the k-nearest-neighbor measure, δ for the similarity measure, and the kernel parameters; then we evaluate the performance of the proposed algorithm in terms of computational efficiency and recognition accuracy.
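A generic cross-validation loop for this kind of parameter selection can be sketched as follows. The paper's exact protocol is not reproduced; the fold count, the grid, and the use of a 1-nearest-neighbor classifier in the reduced space are illustrative assumptions:

```python
import numpy as np
from itertools import product

def select_parameters(X, y, extract, grid, n_folds=5):
    """Choose procedure parameters (e.g. neighborhood size k, similarity
    width, kernel parameters) by cross-validated 1-NN accuracy.

    extract(X_train, y_train, X_test, **params) -> (Z_train, Z_test)
        a hypothetical feature extractor returning reduced features.
    grid: dict mapping parameter names to lists of candidate values.
    """
    n = len(y)
    folds = np.array_split(np.random.permutation(n), n_folds)
    best, best_acc = None, -1.0
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        accs = []
        for f in folds:
            test = np.zeros(n, bool)
            test[f] = True
            Ztr, Zte = extract(X[~test], y[~test], X[test], **params)
            # 1-nearest-neighbor classification in the reduced space.
            d = ((Zte[:, None] - Ztr[None]) ** 2).sum(-1)
            pred = y[~test][d.argmin(axis=1)]
            accs.append((pred == y[test]).mean())
        acc = float(np.mean(accs))
        if acc > best_acc:
            best, best_acc = params, acc
    return best, best_acc
```

Any of the extractors above (LPP, CLPP, or a kernelized variant) can be plugged in as `extract`, so the same loop selects k, δ, and the kernel parameters uniformly.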

Conclusion and future work

A novel supervised feature extraction method, namely class-wise locality preserving projection (CLPP), and its kernel extension are proposed in this paper. The main contributions are summarized as follows. (1) Both the local structure and the class labels are fully taken into consideration for feature extraction in the CLPP algorithm. (2) CLPP guides the construction of the nearest neighbor graph with the label information, and achieves higher computational efficiency than LPP.

Acknowledgement

The authors thank the anonymous reviewers for their constructive comments.

References (27)

  • T. Hastie et al., Principal curves, Journal of the American Statistical Association (1989)
  • X. He, P. Niyogi, Locality preserving projections, in: Proc. Conf. Advances in Neural Information Processing Systems, ...
  • X. He et al., Face recognition using Laplacianfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)