
Neurocomputing

Volume 171, 1 January 2016, Pages 1629-1636

Brief Papers
Recursive Dimension Reduction for semisupervised learning

https://doi.org/10.1016/j.neucom.2015.06.062

Abstract

Semisupervised Dimension Reduction (SDR) Using Trace Ratio Criterion (TR-FSDA) is an effective iterative SDR algorithm, which introduces a flexible regularization term ||F - X^T W||^2 to relax the hard linear constraint in SDA that the low-dimensional representation F must lie in the linear subspace spanned by the data matrix X. We observe, however, that TR-FSDA may take meaningless features into the iteration and cannot always be guaranteed to converge. In this paper, we propose a novel method for SDR, referred to as Recursive Dimension Reduction for Semisupervised Learning (RDS). Instead of solving the non-trivial TR problem using the iterative algorithm of TR-FSDA, we solve the objective function of TR-FSDA using a newly developed recursive procedure. In each recursion, only one projection vector and a one-dimensional data representation are produced, by solving a standard Rayleigh Quotient problem. Our algorithm is free of the convergence issue, since it solves the objective directly and requires no iterative strategy in finding each of the projection vectors. Experiments on four face databases, one object database, one shape image database, and one Handwritten Digit database demonstrate the effectiveness of RDS.

Introduction

Dimension reduction (DR) has attracted much attention in pattern recognition, computer vision, etc., since it can effectively address the so-called "curse of dimensionality" problem. Two of the most well-known DR algorithms are Principal Component Analysis (PCA) [1] and Fisher Linear Discriminant (FLD) [2]. In addition, by considering different aspects of DR, different algorithms have been developed [1], [2], [3], [4], [5], [6], [7], [8], [40], [41], [42], [43], e.g., to yield nonnegative projections [40], [41] by using nonnegative matrix factorization, and to improve discrimination by using the parallel vector field embedding algorithm [42], [43].

PCA and FLD are two linear subspace learning algorithms, which, however, cannot discover essential data structures that are nonlinear. Recently, a number of manifold-based learning techniques, e.g., Isometric Feature Mapping (ISOMAP) [4], Local Linear Embedding (LLE) [5], and Laplacian Eigenmap (LE) [6], have been developed to resolve this problem. The central idea of manifold learning is to find an intrinsic low-dimensional embedding of the data. However, these classical approaches cannot map new, unseen samples. To solve this out-of-sample problem, He et al. proposed Locality Preserving Projections (LPP) [7]. Yang et al. [8] proposed a "classification-oriented" technique, called Unsupervised Discriminant Projection (UDP). Yan et al. [9] recently proposed a general formulation known as graph embedding, which provides a unified view of a broad set of DR techniques.

Supervised learning may require collecting a large number of labeled data points, and directly labeling these data is not only time-consuming but also expensive. Furthermore, supervised learning algorithms may not obtain promising results when the label information is insufficient. Therefore, semisupervised learning plays an important role in solving these problems. In the literature, many semisupervised learning algorithms for classification have been developed, e.g., Transductive SVM (TSVM) [10], [11] and graph-based semisupervised learning algorithms [12], [13], [14], [15]. Among them, GFHF [12] and LGC [13] are two label propagation approaches designed to predict the labels of the unlabeled data in the training set, which, however, cannot deal with new, unseen data. The linear LapRLS [14] can be viewed as an "out-of-sample" extension of LGC/GFHF. Recent years have witnessed numerous research activities on semisupervised DR [16], [17], [18], [19], [20] for different tasks. However, these algorithms suffer from the constraint that the low-dimensional data representation F must lie within the linear space spanned by all the training samples. To relax this hard linear constraint, Nie et al. developed Flexible Manifold Embedding (FME) [21], a multidimensional extension of LapRLS. In [22], Nie et al. proposed Semisupervised Dimensionality Reduction via Virtual Label Regression (VLR), which can be viewed as a two-step formulation of FME with outlier detection.
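As an illustration of the label-propagation idea behind GFHF, the harmonic solution assigns each unlabeled point a soft label determined by its graph neighbors. The following is a minimal sketch (the function name is ours; we assume a symmetric affinity matrix with the labeled points indexed first, as in the setting of this paper):

```python
import numpy as np

def gfhf_propagate(W, y_l):
    """GFHF-style harmonic label propagation (illustrative sketch).

    W   : (n, n) symmetric affinity matrix; the first l rows/columns
          correspond to the labeled points.
    y_l : (l, C) one-hot label matrix for the labeled points.
    Returns an (n - l, C) matrix of soft labels for the unlabeled points.
    """
    l = y_l.shape[0]
    D = np.diag(W.sum(axis=1))
    L = D - W                      # graph Laplacian
    L_uu = L[l:, l:]               # unlabeled-unlabeled block
    L_ul = L[l:, :l]               # unlabeled-labeled block
    # Harmonic solution: F_u = -L_uu^{-1} L_ul Y_l
    return -np.linalg.solve(L_uu, L_ul @ y_l)
```

Note that the solution is transductive: it only produces labels for the unlabeled points already present in the graph, which is exactly the limitation that out-of-sample extensions such as linear LapRLS address.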

Considering that in Semisupervised Discriminant Analysis (SDA) [19], [20] the manifold smoothness term is introduced into the objective function of FLD and the low-dimensional data representation F is constrained to be in the space spanned by all the training samples, Semisupervised Dimension Reduction Using Trace Ratio Criterion (TR-FSDA) was proposed to relax this constraint by modeling the mismatch between F and h(X) = X^T W [23]. To solve the resulting non-trivial Trace Ratio (TR) optimization problem, an iterative algorithm is designed to find F and W simultaneously. TR-FSDA has demonstrated promising results for different recognition tasks.
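The generic iterative scheme underlying such TR solvers alternates between fixing the trace-ratio value and solving an eigenproblem. The sketch below illustrates this standard scheme only, not the full TR-FSDA update (which also optimizes F); the function name and stopping rule are ours:

```python
import numpy as np

def trace_ratio(A, B, r, n_iter=50, tol=1e-10):
    """Generic iterative solver for max_{W^T W = I} tr(W^T A W) / tr(W^T B W).

    At each step, W is set to the top-r eigenvectors of A - lam * B,
    then the trace-ratio value lam is updated from the new W; lam is
    known to increase monotonically under this scheme.
    """
    lam = 0.0
    W = None
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(A - lam * B)
        W = vecs[:, np.argsort(vals)[::-1][:r]]   # top-r eigenvectors
        new_lam = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return W, lam
```

On a toy problem with A = diag(4, 1) and B = I, for example, the scheme converges in two steps to the leading direction with trace-ratio value 4.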

In this paper, we target the semisupervised DR problem using a well-designed recursive procedure. Our algorithm is based on TR-FSDA. Despite its demonstrated performance advantage, TR-FSDA suffers from two restrictions. First, the meaningful discriminant projection vectors may not be correctly found. TR-FSDA adopts an iterative algorithm to address the non-trivial TR optimization problem. At each iteration, it solves a problem similar to Maximum Margin Criterion (MMC) [24], which optimizes max_{W, W^T W = I} tr(W^T A W - W^T B W), where tr(.) denotes the trace, and A and B are graph matrices encoding different types of information about the data [9]. As claimed in [24], the most meaningful discriminant projection vectors should be selected as the eigenvectors corresponding to the eigenvalues of A - B that are greater than or equal to zero. TR-FSDA does not take this into account and arbitrarily chooses the number of discriminant projection axes at each iteration, so meaningless discriminant projection vectors from previous iterations may affect the generation of optimal discriminant projection vectors in the following iterations. Second, the convergence cannot be guaranteed. At the t-th iteration, TR-FSDA requires the calculation of a matrix Z_t, which depends on the TR value lambda_t calculated from the projection matrix of the previous step and on the graph matrices L-bar_a and L-bar_b, whose exact definitions can be found in [23]. Z_t must be positive definite to ensure the convergence of TR-FSDA, which is not always true in real applications and depends on the values of the parameters involved in TR-FSDA. To sidestep this problem, Huang et al. [23] proposed to simply discard the parameter combinations that cause non-convergence. However, doing so is unsatisfactory, since those parameters are meant to balance different terms to improve the recognition results. Thus, TR-FSDA may achieve undesired results.
To address these problems, a new recursive procedure is designed to calculate F and W. At each recursion, our approach solves a standard Rayleigh Quotient problem rather than the non-trivial TR optimization problem, and thus avoids the problems that exist in TR-FSDA.


TR-FSDA revisited

Given a data matrix X = [x_1, x_2, ..., x_l, x_{l+1}, x_{l+2}, ..., x_n] in R^{d x n}, the first l points x_i (i <= l) are labeled and the remaining u points are unlabeled. The label of a labeled point x_i is y_i in {1, 2, ..., C}, where C denotes the number of classes. We also define a linear regression function h(X) = X^T W, where W in R^{d x r} is the projection matrix and r is the dimensionality of the lower-dimensional subspace. We construct the following graph using the popular method: if x_i is among the k nearest neighbors of x_j or x_j
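A common instantiation of this k-nearest-neighbor graph construction uses heat-kernel weights and symmetrizes via the "or" rule. The sketch below is illustrative only (the helper name and the sigma parameter are our assumptions, not definitions from the paper):

```python
import numpy as np

def knn_graph(X, k=5, sigma=1.0):
    """Symmetric k-NN affinity graph with heat-kernel weights (sketch).

    X : (d, n) data matrix, one sample per column.
    S[i, j] = exp(-||x_i - x_j||^2 / sigma) if x_i is among the k nearest
    neighbors of x_j, or vice versa; otherwise S[i, j] = 0.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    S = np.zeros((n, n))
    for j in range(n):
        idx = np.argsort(d2[:, j])[1:k + 1]     # skip the point itself
        S[idx, j] = np.exp(-d2[idx, j] / sigma)
    return np.maximum(S, S.T)                   # symmetrize: the "or" rule
```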

RDS

In this section, we develop a new approach, called Recursive Dimension Reduction for Semisupervised Learning (RDS), which uses a novel recursive procedure to extract the discriminant projection vectors.
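To convey the flavor of such a recursive scheme, the sketch below extracts one direction per step by solving a generalized Rayleigh Quotient, then deflates before the next step. This is only an illustrative sketch under our own assumptions (the function name, the deflation projector, and the ridge term are ours); it is not the RDS update derived in the paper:

```python
import numpy as np
from scipy.linalg import eigh

def recursive_directions(A, B, r):
    """Extract r projection vectors one at a time (illustrative sketch).

    Each step solves the generalized Rayleigh Quotient
        max_w (w^T A w) / (w^T B w),
    i.e. takes the leading generalized eigenvector of the pair (A, B),
    then deflates so that later vectors capture new directions.
    """
    d = A.shape[0]
    P = np.eye(d)                       # deflation projector
    Ws = []
    for _ in range(r):
        Ap = P @ A @ P.T
        Bp = P @ B @ P.T + 1e-8 * np.eye(d)   # ridge keeps Bp positive definite
        vals, vecs = eigh(Ap, Bp)             # ascending generalized eigenvalues
        w = vecs[:, -1]                       # leading generalized eigenvector
        w /= np.linalg.norm(w)
        Ws.append(w)
        P = P - np.outer(w, w) @ P            # remove the found direction
    return np.column_stack(Ws)
```

Because each step is a standard (generalized) eigenproblem with a closed-form solution, no inner iterative loop, and hence no convergence condition, is involved in finding each direction.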

Experiments

We evaluate our algorithm on four face databases, UMIST [31], ORL [32], YALE [33], and FERET [34], a shape image database, MPEG-7 [35], an object database, COIL20 [36], and a Handwritten Digit (HD) database [37]. In [22], the authors inferred that COIL20 and UMIST have a clear manifold structure. Table 1 describes the details of each database used in the experiments. For the UMIST, MPEG-7, COIL20, and HD databases, 50% of the samples per class are randomly selected as the training

Conclusion

The primary goal of TR-FSDA is to better cope with data sampled from a nonlinear manifold that is somewhat close to a linear subspace, by relaxing the hard linear constraint F = X^T W in SDA. TR-FSDA adopts an iterative algorithm to solve the resulting optimization problem. However, the matrix Z_t at each iteration is required to be positive definite to ensure the convergence of TR-FSDA, which does not always hold in real applications. Furthermore, the difference formulation in TR-FSDA may lead

Acknowledgment

The authors gratefully acknowledge support from the Scientific Research Foundation for Advanced Talents and Returned Overseas Scholars of Nanjing Forestry University (163070679), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (14KJB520018), the Natural Science Foundation of Jiangsu Province of China (BK2012399), the Practice Innovation Training Program Projects for Jiangsu College Students, and the Natural Science Foundation of China (61101197, 61401214, and 61402192).

Qiaolin Ye received the BS degree in Computer Science from Nanjing Institute of Technology, Nanjing, China, in 2007, the MS degree in Computer Science and Technology from Nanjing Forestry University, Jiangsu, China, in 2009, and the Ph.D. degree in Pattern Recognition and Intelligence System from Nanjing University of Science and Technology, Jiangsu, China, in 2013.

He is currently an associate professor with the computer science department at Nanjing Forestry University, Nanjing, China. He has authored more than 30 scientific papers in pattern recognition, machine learning, and data mining. His research interests include machine learning, data mining, and pattern recognition.

References (43)

  • C. Xiang et al., Face recognition using recursive Fisher linear discriminant, IEEE Trans. Image Process. (2006)
  • J. Yang et al., Globally maximizing, locally minimizing: unsupervised discriminant projection with applications to face and palm biometrics, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • S. Yan et al., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • V. Vapnik, Statistical Learning Theory (1998)
  • R. Collobert et al., Large scale transductive SVMs, J. Mach. Learn. Res. (2006)
  • X. Zhu et al., Semi-supervised learning using Gaussian fields and harmonic functions, Proc. ICML (2003)
  • D. Zhou et al., Learning with local and global consistency, Proc. NIPS (2004)
  • M. Belkin et al., Manifold regularization: a geometric framework for learning from examples, J. Mach. Learn. Res. (2006)
  • V. Sindhwani et al., Beyond the point cloud: from transductive to semi-supervised learning, Proc. Int. Conf. Mach. Learn. (2005)
  • S.M. Xiang et al., Nonlinear dimensionality reduction with local spline embedding, IEEE Trans. Knowl. Data Eng. (2009)
  • D. Cai et al., Semi-supervised discriminant analysis, Proc. ICCV (2007)

    T.M. Yin received the B.S. degree in forestry and the Ph.D degree in genetics and molecular biology from Nanjing Forestry University, Jiangsu, China. Dr. Yin’s main research interests focus on genomics, gene function, and molecular breeding of woody plants. His representative achievements include: (1) contribution towards construction of the genetic platforms for tree genomic studies, (2) mapping and cloning of genes underlying important traits in woody plants, (3) development of genetic tools and marker resources for applicability of the sequenced poplar genome to studies of alternate poplar genotypes and species and (4) discovery on the genetic mechanism triggering the evolution process from hermephordites to diecious plants and genomic proofs for parapatric speciation.

    In 2011, Dr. Yin won the Outstanding Young Scientist Fund of Natural Science Fund of China. In 2010, Dr. Yin was nominated as one of the top ten outstanding young scientists in Jiangsu province of China. In 2008, he was the awardee of the Cheung Kong Scholars Program of China. The other honors recognized for Dr. Yin include distinguished contributor and awardee for Science and Technology Development at Oak Ridge National Lab, Department of Energy, U.S.A.; awardee of New Century Excellent Talents Program of China; awardee of Jubilee Award issued by International Fund of Sweden. Contributing editor for book Tree Genetics and Breeding, which won the national second prize for Excellent Scientific and Technical Books. Dr. Yin is an active reviewer for some famous international journals, such as Genome Research, New Phytologist, Molecular Breeding etc. He also serves as the academic editor for PLosOne.

    Shangbing Gao received the BS degree in mathematics from the Northwestern Polytechnical University in 2003. He received the MS degree in applied mathematics from the Nanjing University of Information and Science and Technology in 2006. He is now working at Huaiyin institute of technology as an assistant lecturer. He is currently pursuing the Ph.D. degree with School of Computer Science and Technology, Nanjing University of Science and Technology (NUST). He is on the subject of pattern recognition and intelligence systems. His current research interests include pattern recognition and computer vision.
