Pattern Recognition

Volume 60, December 2016, Pages 813-823

Flexible constrained sparsity preserving embedding

https://doi.org/10.1016/j.patcog.2016.06.027

Highlights

  • Two non-linear semi-supervised embedding methods are proposed.

  • These methods elegantly integrate sparsity preserving and constrained embedding.

  • The second framework provides a non-linear embedding together with its out-of-sample extension.

  • Classification performance after embedding is assessed with KNN and SVM classifiers.

  • Experimental results on eight public image datasets show that the proposed methods outperform competitive semi-supervised embedding techniques.

Abstract

In this paper, two semi-supervised embedding methods are proposed, namely Constrained Sparsity Preserving Embedding (CSPE) and Flexible Constrained Sparsity Preserving Embedding (FCSPE). CSPE can be considered as a semi-supervised extension of Sparsity Preserving Projections (SPP) integrated with the idea of in-class constraints. Both labeled and unlabeled data can be utilized within the CSPE framework. However, CSPE does not have an out-of-sample extension, since the projection of unseen samples cannot be obtained directly. In order to obtain inductive semi-supervised learning, i.e., to be able to handle unseen samples, we propose FCSPE, which simultaneously provides a non-linear embedding and an approximate linear projection in one regression function. FCSPE simultaneously achieves the following: (i) the local sparse structure is preserved, (ii) data samples with the same label are mapped onto one point in the projection space, and (iii) the linear projection that is closest to the non-linear embedding is estimated. Experimental results on eight public image data sets demonstrate the effectiveness of the proposed methods as well as their superiority to many competitive semi-supervised embedding techniques.

Introduction

In many real-world applications, such as face recognition and text categorization, the data are usually provided in a high-dimensional space. Moreover, collecting a large number of labeled samples is often practically impossible, for two reasons: labeled samples are scarce in many domains, and acquiring labels requires expensive human labor. To deal with this problem, semi-supervised embedding methods can be used to project the data from the high-dimensional space into a space with fewer dimensions while exploiting both labeled and unlabeled samples.

Many methods for dimensionality reduction have been proposed. Principal Component Analysis (PCA) [1] and Multidimensional Scaling (MDS) [2] are two classic linear unsupervised embedding methods, while Linear Discriminant Analysis (LDA) [1] is a supervised method. In 2000, Locally Linear Embedding (LLE) [3] and Isometric Feature Mapping (ISOMAP) [4] were independently published in Science, laying the foundation of manifold learning. Soon afterwards, Belkin et al. proposed Laplacian Eigenmaps (LE) [5]. He et al. proposed both Locality Preserving Projection (LPP) [6], essentially a linearized version of LE, and Neighborhood Preserving Embedding (NPE) [7], a linearized version of LLE. LPP and NPE can be interpreted in a general graph embedding framework with different choices of graph structure. Most of these methods are unsupervised. Afterwards, sparse representation based methods [8], [9], [10] attracted extensive attention. Lai et al. proposed a 2-D feature extraction method, called sparse 2-D projections, for image feature extraction [11]. In [12], a robust tensor learning method called Sparse Tensor Alignment (STA) is proposed for unsupervised tensor feature extraction based on the alignment framework. In [13], Multilinear Sparse Principal Component Analysis (MSPCA) inherits the sparsity of sparse PCA and iteratively learns a series of sparse projections that capture most of the variation of the tensor data.

Sparsity Preserving Projection (SPP) is an unsupervised learning method [10]. It can be considered as an extension of NPE, since the two share a similar objective function; however, SPP utilizes sparse representation over the whole data set to obtain the affinity matrix.
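To make this construction concrete, the following is a minimal sketch of how an SPP-style affinity matrix can be built with an ℓ1 solver. The use of scikit-learn's Lasso and the regularization weight alpha are illustrative assumptions, not the authors' exact setup:

```python
import numpy as np
from sklearn.linear_model import Lasso

def spp_affinity(X, alpha=0.01):
    """Sparse-representation affinity: code each sample over all others.

    X : (m, n) data matrix, one sample per column.
    alpha is an illustrative choice, not the paper's exact setting.
    """
    m, n = X.shape
    W = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        # Represent x_i as a sparse combination of the remaining samples.
        lasso = Lasso(alpha=alpha, max_iter=5000)
        lasso.fit(X[:, idx], X[:, i])
        W[i, idx] = lasso.coef_
    return W
```

SPP then seeks a projection that preserves these reconstruction coefficients in the low-dimensional space, analogously to the way NPE preserves its neighborhood reconstruction weights.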

In the last decade, semi-supervised learning algorithms have been developed to effectively utilize a large amount of unlabeled samples as well as the limited number of labeled samples for real world applications [14], [15], [16], [17], [18], [19], [20], [21], [22]. In the past years, many graph-based methods for semi-supervised learning have been developed [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35].

Constrained Laplacian Eigenmaps (CLE) [36] is a semi-supervised embedding method. CLE constrains the solution space of Laplacian Eigenmaps to contain only embedding results that are consistent with the labels: labeled points belonging to the same class are merged together, labeled points belonging to different classes are separated, and similar points stay close to one another. Similarly, Constrained Graph Embedding (CGE) [37] uses a constraint matrix to project the data points from the same class onto a single point in the projection space.
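As an illustration of such in-class constraints, suppose the first l of n samples are labeled with one of c classes; CGE-style methods parameterize the embedding coordinates as y = Uz, where U merges same-class labeled points. A minimal sketch of this constraint matrix follows; the exact layout is an assumption based on the description above, not necessarily the construction in [37]:

```python
import numpy as np

def constraint_matrix(labels, n):
    """Build a CGE-style constraint matrix U (an illustrative layout).

    labels : length-l array of class labels (values 0..c-1) for the
             first l samples; the remaining n-l samples are unlabeled.
    Under y = U @ z, all labeled samples of one class share a single
    free coordinate, so they are mapped onto one point.
    """
    l = len(labels)
    c = int(np.max(labels)) + 1
    U = np.zeros((n, c + n - l))
    for i, lab in enumerate(labels):
        U[i, lab] = 1.0          # labeled: one shared column per class
    for i in range(l, n):
        U[i, c + i - l] = 1.0    # unlabeled: one free column per sample
    return U
```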

Flexible Manifold Embedding (FME) [38] is a label propagation method. FME simultaneously estimates the non-linear embedding of the unlabeled samples and a linear regression over these non-linear representations. In [39], the authors propose a learning process that provides both the data graph and a linear regression within a single framework.
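Schematically, and up to notation (F is the estimated soft embedding, Y the label matrix, U a diagonal matrix selecting the labeled samples, M a graph Laplacian, P the projection matrix, and μ, γ trade-off parameters), the FME criterion of [38] couples a graph term with a flexible regression term:

```latex
\min_{F,\,P,\,b}\;
  \operatorname{tr}\!\big((F-Y)^{T} U (F-Y)\big)
  + \operatorname{tr}\!\big(F^{T} M F\big)
  + \mu\big(\|P\|^{2} + \gamma\,\|X^{T}P + \mathbf{1}b^{T} - F\|^{2}\big)
```

The residue term $X^{T}P + \mathbf{1}b^{T} - F$ lets the linear regression deviate from the non-linear embedding, which is what makes the model "flexible" and gives it an out-of-sample extension.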

SPP is a successful unsupervised learning method. To extend SPP to a semi-supervised embedding method, we introduce the idea of in-class constraints from CGE into SPP and propose a new semi-supervised method for data embedding named Constrained Sparsity Preserving Embedding (CSPE). The weakness of CSPE is that it cannot handle newly arriving samples, which means a cascaded regression has to be performed after the non-linear mapping is obtained by CSPE over the whole training set. Inspired by FME, we add a regression term to the objective function so that an approximate linear projection is obtained simultaneously with the non-linear embedding, and propose Flexible Constrained Sparsity Preserving Embedding (FCSPE). In this paper, therefore, two semi-supervised embedding methods, namely CSPE and FCSPE, are proposed. Compared to existing work, the proposed CSPE retains the advantages of both CGE and SPP. On the other hand, the proposed FCSPE simultaneously estimates the non-linear mapping over the training samples and the linear projection that solves the out-of-sample problem, which is usually not provided by existing graph-based semi-supervised non-linear mapping methods.
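Putting the ingredients together, an FCSPE-style criterion can be sketched as follows. This is a schematic form assembled from the description above (with W the sparse-representation coefficient matrix, U the in-class constraint matrix, Y the embedding, and μ, γ trade-off parameters), not the paper's exact equation:

```latex
\min_{Z,\,P,\,b}\;
  \|Y - Y W^{T}\|_{F}^{2}
  + \mu\big(\|P\|_{F}^{2} + \gamma\,\|P^{T}X + b\mathbf{1}^{T} - Y\|_{F}^{2}\big),
\qquad \text{s.t. } Y^{T} = UZ
```

The first term preserves the sparse reconstruction relations, the constraint $Y^{T} = UZ$ merges same-class labeled samples, and the regression term yields the linear projection $P$ that can be applied to unseen samples.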

This paper is organized as follows. Section 2 reviews the related methods, including LPP, SPP, CGE and FME. Section 3 introduces the two proposed semi-supervised methods. Section 4 presents performance evaluations on six face image databases (Yale, ORL, FERET, PIE, Extended Yale B, and LFW, in both its original and aligned versions), the handwritten digit database USPS, and the object image database COIL-20. Section 5 presents some concluding remarks.


Related work

Some mathematical notations that will be used in the next several sections are listed here. Let $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{m\times n}$ be the data matrix, where $n$ is the number of training samples and $m$ is the dimension of each sample. Let $\mathbf{y}=[y_1,y_2,\dots,y_n]^T$ be a one-dimensional map of $X$. Under a linear projection $\mathbf{y}^T=\mathbf{p}^T X$, each data point $x_i$ in the input space $\mathbb{R}^m$ is mapped to $y_i=\mathbf{p}^T x_i$ on the real line, where $\mathbf{p}\in\mathbb{R}^m$ is a projection axis. Let $Y\in\mathbb{R}^{d\times n}$ denote the data projections in a $d$-dimensional space.
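As a toy illustration of this notation (shapes only; the numbers are random placeholders, not data from the paper):

```python
import numpy as np

m, n, d = 100, 20, 5          # feature dim, #samples, embedding dim
X = np.random.randn(m, n)     # data matrix, one sample per column
p = np.random.randn(m)        # a single projection axis
y = p @ X                     # one-dimensional map y^T = p^T X, shape (n,)
P = np.random.randn(m, d)     # d projection axes stacked as columns
Y = P.T @ X                   # d-dimensional projections, shape (d, n)
```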

Proposed methods

In this section, we propose a semi-supervised learning method named Constrained Sparsity Preserving Embedding (CSPE). Afterwards, we introduce another flexible semi-supervised embedding method named Flexible Constrained Sparsity Preserving Embedding (FCSPE).

The framework of CSPE does not provide a straightforward solution to the out-of-sample problem; indeed, the regression has to be carried out as an extra step. With the flexible method, FCSPE, both the non-linear mapping and the regression are estimated simultaneously.
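The extra step that CSPE would require can be illustrated with a post-hoc (cascaded) regression mapping the data to an already-computed embedding; FCSPE instead folds this step into the objective. A minimal sketch, with ridge regression as an illustrative choice of regressor:

```python
import numpy as np
from sklearn.linear_model import Ridge

def out_of_sample_map(X_train, Y_train, X_new, lam=1.0):
    """Cascaded regression: fit a linear map from data to embedding.

    X_train : (m, n) training data; Y_train : (d, n) its embedding;
    X_new : (m, k) unseen samples. Returns their (d, k) embeddings.
    """
    reg = Ridge(alpha=lam)
    reg.fit(X_train.T, Y_train.T)   # learn Y ≈ P^T X + b
    return reg.predict(X_new.T).T
```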

Performance evaluation

In this section, we evaluate the proposed methods on eight real image databases: Yale, ORL, FERET, PIE, Extended Yale B, LFW (the original data set and the aligned version), COIL-20, and USPS.
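The protocol suggested by the highlights (classify in the embedded space with KNN and SVM) can be sketched as follows; the classifier settings, such as the neighborhood size and kernel, are illustrative defaults rather than the paper's exact configuration:

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def evaluate(Y_train, labels_train, Y_test, labels_test):
    """Classify in the embedded space with 1-NN and a linear SVM."""
    results = {}
    for name, clf in [("1-NN", KNeighborsClassifier(n_neighbors=1)),
                      ("SVM", SVC(kernel="linear"))]:
        clf.fit(Y_train.T, labels_train)   # samples as rows
        pred = clf.predict(Y_test.T)
        results[name] = accuracy_score(labels_test, pred)
    return results
```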

Conclusion and discussion

In this paper, two semi-supervised methods for data embedding are proposed. The proposed methods utilize the label information of the labeled data together with a manifold regularization (derived from the sparsity preserving criterion) over both labeled and unlabeled training data. The FCSPE method can generate a linear projection for unseen data points through a linear regression term in the objective function.

The experimental results on eight real image databases clearly demonstrate the effectiveness of the proposed methods and their superiority over competing semi-supervised embedding techniques.

Acknowledgments

This work is partially supported by National Natural Science Foundation of China under Grant Nos. 61373063, 61420201, 61472187, 61233011, 61375007, 61220301, and by National Basic Research Program of China under Grant No. 2014CB349303.


References (45)

  • S.T. Roweis et al., Nonlinear dimensionality reduction by locally linear embedding, Science (2000).
  • J.B. Tenenbaum et al., A global geometric framework for nonlinear dimensionality reduction, Science (2000).
  • M. Belkin et al., Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput. (2003).
  • X. He et al., Locality preserving projections, Neural Inf. Process. Syst. (2003).
  • X. He, D. Cai, S. Yan, H.-J. Zhang, Neighborhood preserving embedding, in: Tenth IEEE International Conference on...
  • Z. Lai et al., Approximate orthogonal sparse embedding for dimensionality reduction, IEEE Trans. Neural Netw. Learn. Syst. (2016).
  • S. Yan, H. Wang, Semi-supervised learning by sparse representation, in: International Conference on Data Mining, SIAM,...
  • Z. Lai et al., Sparse approximation to the eigensubspace for discrimination, IEEE Trans. Neural Netw. Learn. Syst. (2012).
  • Z. Lai et al., Sparse alignment for robust tensor learning, IEEE Trans. Neural Netw. Learn. Syst. (2014).
  • Z. Lai et al., Multilinear sparse principal component analysis, IEEE Trans. Neural Netw. Learn. Syst. (2014).
  • D. Zhou, J. Huang, B. Schölkopf, Learning from labeled and unlabeled data on a directed graph, in: International...
  • X. Zhu, Semi-supervised learning, in: Encyclopedia of Machine Learning, Springer-Verlag, New York, 2010, pp....

    Libo Weng received his B.S. degree in Mathematics and Applied Mathematics from Nanjing University of Science and Technology, Nanjing, China, in 2011. He is currently pursuing the Ph.D. degree in Pattern Recognition and Intelligent Systems at Nanjing University of Science and Technology, Nanjing, China. He is also an international joint Ph.D. Student at the University of the Basque Country UPV/EHU, San Sebastian, Spain. His current research interests include pattern recognition and machine learning.

    Fadi Dornaika received the M.S. degree in signal, image and speech processing from Grenoble Institute of Technology, France, in 1992, and the Ph.D. degree in computer science from Grenoble Institute of Technology, France, in 1995. He is currently a Research Professor at IKERBASQUE (Basque Foundation for Science) and the University of the Basque Country. Prior to joining IKERBASQUE, he held numerous research positions in Europe, China, and Canada. He has published more than 200 papers in the field of computer vision and pattern recognition. His current research interests include pattern recognition, machine learning and data mining.

    Zhong Jin received the B.S. degree in mathematics, M.S. degree in applied mathematics and the Ph.D. degree in pattern recognition and intelligent system from Nanjing University of Science and Technology, Nanjing, China in 1982, 1984 and 1999, respectively. His current interests are in the areas of pattern recognition and face recognition.
