Brief Papers
Nonparametric discriminant multi-manifold learning for dimensionality reduction
Introduction
During the last decade, dimensionality reduction has attracted considerable attention in fields such as pattern recognition, data mining, computer vision and machine learning. As a fundamental problem in these fields, dimensionality reduction plays an important role in data analysis, with the goal of finding a meaningful low dimensional representation of high dimensional data. In pattern recognition, dimensionality reduction is an effective and feasible way to overcome the “curse of dimensionality”. Moreover, since numerous real world applications produce large volumes of high dimensional data, some of which may be superfluous, extracting the most useful features with dimensionality reduction techniques not only helps to explore the essential structure of the original data, but also makes classification feasible at low computational cost.
Dimensionality reduction methods can be categorized into linear and nonlinear models. Many linear methods, such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Independent Component Analysis (ICA), have been widely used in practical applications with good performance [34]. However, they expose their limitations when applied to nonlinearly distributed data. In the last few years, therefore, approaches including manifold learning have been developed for nonlinear dimensionality reduction. Representative manifold learning methods include Isometric Mapping (ISOMAP) [1], Locally Linear Embedding (LLE) [2], [3], Laplacian Eigenmaps (LE) [4], Local Tangent Space Alignment (LTSA) [5], Maximum Variance Unfolding (MVU) [6] and Riemannian Manifold Learning (RML) [7]. Many examples have shown that these methods yield impressive results on both artificial and real world data sets. Compared to other nonlinear dimensionality reduction methods, manifold learning has the following advantages. On the one hand, manifold learning can recover the essential dimensions of manifold data embedded in a high dimensional space. For instance, in Refs. [1], [2], the left–right pose, up–down pose and lighting direction were found to be three essential dimensions of the face manifold. On the other hand, manifold learning seeks to embed the original high dimensional data in a lower dimensional space while preserving locality, where locality is approximated using the k Nearest Neighbors (KNN) criterion. Manifold learning can therefore be applied efficiently to data visualization: both share the goal of projecting the original data into a 2-D or 3-D space in which the intrinsic structure is preserved as much as possible.
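The KNN locality criterion mentioned above can be sketched in a small, generic example; the function and parameter names below are illustrative and do not reproduce any particular algorithm's graph construction:

```python
import numpy as np

def knn_graph(X, k):
    """Build a binary k-nearest-neighbor adjacency matrix for the rows of X.

    A generic sketch of the KNN locality criterion used by manifold
    learning methods; each method then weights this graph in its own way.
    """
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)          # exclude each point as its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]      # indices of the k closest points
        W[i, nbrs] = 1.0
    return W

# toy example: 5 points on a line, k = 2 neighbors each
X = np.arange(5, dtype=float).reshape(-1, 1)
W = knn_graph(X, 2)
```

Methods such as LLE or LE then derive edge weights on this graph (reconstruction coefficients or heat-kernel weights, respectively) before solving for the embedding.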
However, when employed for data classification, manifold learning exhibits some shortcomings. Firstly, manifold learning yields embeddings directly from the training data. Owing to the implicitness of the nonlinear mapping, the original manifold learning methods cannot easily obtain the projection of a new test sample in the embedding space from the low-dimensional embeddings of the training set, which greatly confines their application to pattern classification. To overcome this out-of-sample problem [8], linearization, kernelization, tensorization and other tricks were proposed and shown to efficiently find the low dimensional embeddings of test data on the basis of the mapping results of the training samples [9], [10].
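Under the linearization trick described above, the learned mapping is restricted to a linear projection, so an unseen sample can be embedded directly. The sketch below uses a random matrix as a stand-in for a projection learned by some linearized manifold learning method; the names and dimensions are purely illustrative:

```python
import numpy as np

# Hypothetical illustration of the linearization trick: once a linear
# projection A (d x r) has been learned on the training set, any unseen
# sample x embeds as y = A^T x, sidestepping the out-of-sample problem
# of purely nonlinear embeddings. Here A is random, standing in for a
# learned projection.
rng = np.random.default_rng(0)
d, r = 10, 2
A = rng.standard_normal((d, r))     # stand-in for a learned projection
x_new = rng.standard_normal(d)      # an unseen test sample
y_new = A.T @ x_new                 # its low dimensional embedding
```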
Secondly, the existing manifold learning algorithms work well when the data lie on a single manifold. When there are many manifolds, however, how to clearly identify the different manifolds still requires further investigation. For example, if face images sampled from many persons exist in a high dimensional space, then different persons' face images should lie on corresponding manifolds, and it is necessary to distinguish individual face images belonging to different manifolds. To achieve optimal recognition results, the recovered embeddings associated with different manifolds should be as separable as possible in the final embedding space, a problem that might be called “classification-oriented multi-manifold learning” [11]. The problem cannot be solved by current manifold learning algorithms and their supervised versions [12], [13], [14], [15], [16], [17], [18], [19], [20], because they all concentrate on the characterization of “locality” and do not take the variances among manifolds into account. Recently, researchers have put forward some discriminant multi-manifold learning methods, in which a supervised graph is constructed to approximate the distances between manifolds. Lai proposed a soft margin scatter and a soft within-class scatter, both of which were introduced to find an optimal subspace for data classification [21]. Wang presented a maximum inter-class and marginal discriminant embedding for manifold identification [22]. On the basis of Unsupervised Discriminant Projection (UDP) [11], Chen advanced a locally statistical uncorrelation constraint [23]. Later, Lu defined an inter-manifold graph and an intra-manifold graph according to class information, and the corresponding graph Laplacian spectrum was used to search for the optimal projection [24]. Similarly, Chen applied the least reconstruction trick to both an inter-class graph and an intra-class graph to seek a discriminant subspace [25].
All the methods mentioned above characterized the separability of manifolds globally and failed to take advantage of the local distances between manifolds.
In this paper, therefore, a dimensionality reduction method named Nonparametric Discriminant Multi-manifold Learning (NDML) is presented and applied to multi-manifold identification. In the proposed NDML algorithm, a novel nonparametric manifold-to-manifold distance is defined to model the separability between manifolds, where both the labels and the local structure information of the manifolds are considered. Moreover, the linearization trick is introduced to avoid the out-of-sample problem. Finally, an objective function is constructed to explore an optimal subspace with maximum manifold-to-manifold distances and minimum manifold locality.
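A nonparametric manifold-to-manifold distance of the kind described above can be illustrated as the distance between a point and the mean of its k nearest neighbors on another manifold. The helper below is a hypothetical sketch of this idea, not the exact NDML formulation:

```python
import numpy as np

def inter_manifold_dist(x, X_other, k):
    """Distance from point x to the mean of its k nearest neighbors
    among X_other, the samples of a different class (manifold).

    A sketch of the nonparametric manifold-to-manifold distance idea;
    an actual objective would aggregate such distances over all points.
    """
    d = np.linalg.norm(X_other - x, axis=1)   # distances to the other manifold
    nbrs = np.argsort(d)[:k]                  # the k inter-class nearest neighbors
    mean_nbr = X_other[nbrs].mean(axis=0)     # local mean on the other manifold
    return np.linalg.norm(x - mean_nbr)

x = np.zeros(2)
X_other = np.array([[1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
dist = inter_manifold_dist(x, X_other, 2)     # mean of (1,0) and (2,0) is (1.5,0)
```

Because only the k nearest inter-class neighbors enter the mean, the distance is local: far-away samples of the other manifold do not dominate it.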
The rest of the paper is organized as follows: Section 2 briefly reviews LLE and Constrained Maximum Variance Mapping (CMVM) [26]. In Section 3, the principle of NDML is addressed in detail. Experimental results on the AR, ORL and YaleB face data sets are offered in Section 4, and the paper concludes in Section 5.
Section snippets
Reviews of LLE and CMVM
There are many dimensionality reduction methods related to the proposed NDML such as LLE and CMVM, which will be briefly reviewed in the following.
Nonparametric discriminant multi-manifold learning
Alongside the classical manifold learning methods, more and more supervised extensions have emerged for dimensionality reduction. Some take advantage of class information to adjust the neighborhood weights in the KNN graph; others are integrated with LDA [27] for discriminant dimensionality reduction. However, most of these supervised versions pay more attention to the construction of the local graph than to the local distance between manifolds, which can be introduced to measure the separability of
Experiments
In this section, experiments will be conducted on some benchmark data sets including AR face data, ORL face data and YaleB face data, where UDP, CMVM, LDA and the proposed NDML algorithm are all employed to reduce dimensionality of the original face data. Moreover, in the low dimensional subspace, the Nearest Neighbor (NN) classifier is also adopted, by which the labels of those test data will be predicted.
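The NN classification step in the reduced subspace can be sketched as follows; the data here are toy embeddings, not the actual face features:

```python
import numpy as np

def nn_classify(Y_train, labels, Y_test):
    """Nearest Neighbor classifier in the low dimensional subspace:
    each test embedding takes the label of its closest training embedding."""
    preds = []
    for y in Y_test:
        i = np.argmin(np.linalg.norm(Y_train - y, axis=1))
        preds.append(labels[i])
    return np.array(preds)

# toy 1-D embeddings of two classes
Y_train = np.array([[0.0], [1.0], [5.0]])
labels = np.array([0, 0, 1])
preds = nn_classify(Y_train, labels, np.array([[0.4], [4.2]]))  # -> [0, 1]
```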
However, when carrying out experiments, KNN is applied to approach the locality in UDP,
Conclusion
In this paper, a nonparametric discriminant multi-manifold learning method is proposed for dimensionality reduction. In the proposed method, the distance between manifolds is defined locally and nonparametrically, modeled as the distance between any point and the mean of its inter-class k nearest neighbors. Moreover, an objective function is constructed to explore the low dimensional subspace with maximum manifold-to-manifold distance and minimum locality, which is validated either by the theoretical analysis
Acknowledgment
This work was partly supported by the Grants of the National Natural Science Foundation of China (61273303, 61273225, 61373109 and 61472280), the China Postdoctoral Science Foundation (20100470613 and 201104173), the Natural Science Foundation of Hubei Province (2010CDB03302), the Research Foundation of Education Bureau of Hubei Province (Q20121115), the Program of Wuhan Subject Chief Scientist (201150530152), the Hong Kong Scholars Program (XJ2012012) and the Open Project Program of the
References (34)
- et al., Supervised locally linear embedding with probability-based distance for classification, Comput. Math. Appl. (2009)
- et al., Locally linear discriminant embedding: an efficient method for face recognition, Pattern Recognit. (2008)
- et al., Discriminant subspace learning constrained by locally statistical uncorrelation for face recognition, Neural Netw. (2013)
- et al., Reconstructive discriminant analysis: a feature extraction method induced from linear regression classification, Neurocomputing (2012)
- et al., A global geometric framework for nonlinear dimensionality reduction, Science (2000)
- et al., Nonlinear dimensionality reduction by locally linear embedding, Science (2000)
- et al., Think globally, fit locally: unsupervised learning of low dimensional manifolds, J. Mach. Learn. Res. (2003)
- et al., Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput. (2003)
- et al., Principal manifolds and nonlinear dimensionality reduction by local tangent space alignment, SIAM J. Sci. Comput. (2004)
- et al., Unsupervised learning of image manifolds by semi-definite programming, Int. J. Comput. Vis. (2006)
- Riemannian manifold learning, IEEE Trans. Pattern Anal. Mach. Intell.
- Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering, Technical Report 1238
- Graph embedding: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell.
- Globally maximizing, locally minimizing: unsupervised discriminant projection with application to face and palm biometrics, IEEE Trans. Pattern Anal. Mach. Intell.
Bo Li received his M.Sc. in Mechanical and Electronic Engineering from Wuhan University of Technology in 2003 and his Ph.D. in Pattern Recognition and Intelligent Systems from the University of Science and Technology of China in 2008. He is now an associate professor at the School of Computer Science and Technology, Wuhan University of Science and Technology, and a research associate at Ryerson University. His research interests include machine learning, pattern recognition, image processing and bioinformatics.
Jun Li received his M.Sc. in Computer Science from Wuhan University of Technology in 2003. He is now an associate professor at the School of Computer Science and Technology, Wuhan University of Science and Technology, and a doctoral candidate in the School of Computer, Wuhan University. His research interests include machine learning, pattern recognition, image processing and intelligent computing.
Xiao-Ping Zhang (M'97, SM'02) received B.S. and Ph.D. degrees from Tsinghua University in 1992 and 1996, respectively, both in Electronic Engineering. He holds an MBA in Finance, Economics and Entrepreneurship with Honors from the University of Chicago Booth School of Business, Chicago, IL. He is a professor at Ryerson University. He is currently an Associate Editor for the IEEE Transactions on Signal Processing, IEEE Transactions on Multimedia, IEEE Signal Processing Letters and the Journal of Multimedia. His research interests include intelligent computing, bioinformatics, multimedia communication and signal processing.