Signal Processing

Volume 139, October 2017, Pages 182-189

Short communication

Enhanced regularized least square based discriminative projections for feature extraction

https://doi.org/10.1016/j.sigpro.2017.04.018

Abstract

The regularized least square based discriminative projections (RLSDP) method for feature extraction was recently proposed; it seeks discriminant projection directions that maximize the between-class scatter and minimize the within-class compactness. However, in RLSDP each sample is reconstructed using only the coefficients associated with its own class, which can lead to large reconstruction errors. Moreover, the distances between each sample and the other within-class samples, which carry the most important within-class compactness information, are not minimized in RLSDP. To address these two problems, we propose an enhanced regularized least square based discriminative projections (ERLSDP) method. ERLSDP uses all the representation coefficients of each sample for reconstruction and explicitly minimizes the distances between all the within-class samples; it therefore achieves better reconstruction accuracy and has more discriminating power than RLSDP. Experimental results demonstrate that ERLSDP gives a clear improvement over RLSDP when the training sample size is small.

Introduction

Feature extraction, which aims to produce compact and effective low-dimensional feature representations of high-dimensional data, has been extensively studied over the past several decades. Compared with the global principal component analysis (PCA) [1] and linear discriminant analysis (LDA) [2] approaches, manifold learning methods are more appealing since they can discover the local intrinsic structure of data. Representative manifold learning methods include locality preserving projections (LPP) [3], locality preserving discriminant projections (LPDP) [4], discriminative locality alignment (DLA) [5], discriminant locality preserving projections (DLPP) [6], marginal Fisher analysis (MFA) [7], etc. Although their motivations differ, they can all be unified in the graph embedding (GE) framework [7], and their differences lie in graph construction. Manifold learning has found wide application in various fields. For example, Li et al. [8] developed a discriminative distance metric learning (DML) algorithm based on manifold learning, and further derived a distributed and parallel computational scheme to handle the large-scale metric learning problem. Reference [9] exploited manifold learning to analyze multivariate variable-length sequence data. Gao et al. [10] integrated local and global manifold structures for face and image classification.

Recently, sparse representation has shown promising performance in many domains [11], [12], [13], [14], [15]. For instance, Wright et al. [11] proposed a sparse representation based classification (SRC) method for face recognition. Zhou et al. [12] proposed a double shrinking algorithm (DSA) for sparse projection eigenvectors. Moreover, many research efforts [16], [17] have shown that the neighborhood relationship of each data point can be adaptively obtained by sparse representation methods, and that the resulting ℓ1-graph is robust to noise. Based on the ℓ1-graph, Qiao et al. [16] proposed sparsity preserving projections (SPP) for feature extraction, which aims at preserving the sparse reconstruction relationship of the data in both the original space and the low-dimensional embedding space. By combining supervised SPP with the maximum margin criterion, Gui et al. [18] introduced a discriminant sparse neighborhood preserving embedding (DSNPE) algorithm. Gao et al. [10] gave a discriminative sparsity preserving projections (DSPP) method, which first employs sparse representation to build an intrinsic graph and a penalty graph, and then integrates the global within-class structure for dimensionality reduction. Despite their good performance, sparse representation methods need to solve an ℓ1 norm minimization problem, which incurs high computational complexity.
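To make the cost of that coding step concrete, the following is a minimal sketch of solving the ℓ1-regularized coding problem with the iterative shrinkage-thresholding algorithm (ISTA). It only illustrates the kind of iterative solver such methods require (this paper itself uses the ℓ1-magic solver); the function name, regularization value, and iteration count are assumptions.

```python
import numpy as np

def ista_l1(X, y, lam=0.01, n_iter=500):
    """Minimal ISTA sketch for the l1 coding problem used by SRC/SPP-style
    methods: min_s 0.5 * ||y - X s||_2^2 + lam * ||s||_1.

    X : (m, n) dictionary whose columns are training samples
    y : (m,) sample to be coded; lam and n_iter are illustrative values.
    """
    t = 1.0 / np.linalg.norm(X, 2) ** 2          # step size 1/L, L = ||X||_2^2
    s = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = s - t * (X.T @ (X @ s - y))          # gradient step on the quadratic term
        s = np.sign(g) * np.maximum(np.abs(g) - t * lam, 0.0)  # soft-thresholding
    return s
```

Each iteration costs a pair of matrix-vector products and many iterations are typically needed, which is the complexity gap that collaborative representation closes with a single closed-form solve.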

Zhang et al. [19], [20] argued that the collaborative representation mechanism is the key factor behind the success of SRC, and proposed a collaborative representation based classification (CRC) method. CRC replaces the ℓ1 norm in SRC with the simpler ℓ2 norm, and has properties similar to SRC with competitive classification performance. Based on CRC, Yang et al. [21] constructed an ℓ2-graph and developed collaborative representation based projections (CRP) to preserve the collaborative reconstruction relationship of the data. Yin et al. [22] proposed collaborative representation reconstruction based projections (CRRP), in which the projection matrix is obtained by maximizing the collaborative reconstruction between-class scatter and minimizing the collaborative reconstruction within-class scatter. A similar method was proposed in [23]. In [24], Yang et al. developed regularized least square based discriminative projections (RLSDP), which maximizes the between-class scatter adopted by LDA and minimizes the within-class compactness via the reconstruction residual from the same class. However, RLSDP has two main problems. First, reconstructions that use only the coefficients corresponding to the same class can have large errors, so RLSDP cannot give the best reconstruction of each sample. Second, it does not minimize the distances between each sample and the other within-class samples, which is important for minimizing the within-class compactness.
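Because the ℓ2 penalty turns the coding problem into a ridge regression, the collaborative coefficients have a closed-form solution, which is what makes CRC cheap compared with SRC. Below is a minimal sketch under assumed names and a default λ, together with the class-wise regularized-residual decision rule reported for CRC [19], [20]; it is an illustration, not the authors' code.

```python
import numpy as np

def crc_code(X, y, lam=0.001):
    """Collaborative representation coding: solves
    min_s ||y - X s||_2^2 + lam * ||s||_2^2 in closed form,
    coding the query y over the whole training dictionary X.

    X : (m, n) columns are training samples; y : (m,) query;
    lam is an assumed default and should be tuned per dataset.
    """
    n = X.shape[1]
    # Ridge solution: s = (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def crc_classify(X, labels, y, lam=0.001):
    """Assigns y to the class with the smallest regularized residual
    ||y - X_c s_c|| / ||s_c||, as in the CRC decision rule.
    labels : (n,) integer array of class labels for the columns of X."""
    s = crc_code(X, y, lam)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        residuals[c] = (np.linalg.norm(y - X[:, mask] @ s[mask])
                        / np.linalg.norm(s[mask]))
    return min(residuals, key=residuals.get)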

To address the above two problems, we propose an enhanced regularized least square based discriminative projections (ERLSDP). In ERLSDP, each sample is reconstructed by all the associated coefficients, which results in smaller reconstruction error. More importantly, the distances between each sample and all its reconstructed within-class samples, which characterize the most important within-class compactness, are minimized. The optimal discriminant projection of ERLSDP is achieved by maximizing the between-class scatter and minimizing the within-class compactness simultaneously. Experiments on three face databases indicate that our ERLSDP performs better than RLSDP.

The main contributions of our work are as follows. (1) We use all of the representation coefficients to reconstruct each sample, whereas the original RLSDP uses only the partial coefficients corresponding to the same class. Our ERLSDP therefore achieves smaller reconstruction error and better classification performance. (2) We build a weight matrix that explicitly characterizes the within-class geometry of the data, and minimize the distances between all the within-class samples; see the sketch after this paragraph. Meanwhile, by maximizing the between-class scatter, samples sharing the same class label are pulled together and those from different classes are pushed apart, which is a very desirable property for classification tasks.
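To make the second contribution concrete, the sketch below shows the generic scatter-ratio recipe that such criteria instantiate: a binary within-class weight matrix whose graph Laplacian sums the squared pairwise distances between projected within-class samples, traded off against the LDA between-class scatter. It deliberately simplifies ERLSDP, whose actual objective also involves the representation coefficients; the function name, ridge term, and solver choice are assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def discriminant_projection(X, labels, d):
    """Simplified scatter-ratio sketch: maximize the LDA between-class
    scatter against a within-class compactness term built from a
    binary within-class weight matrix.

    X : (m, n) columns are samples; labels : (n,) integer array;
    d : target dimensionality. Returns an (m, d) projection matrix."""
    m, n = X.shape
    mean = X.mean(axis=1, keepdims=True)

    # W_ij = 1 iff x_i and x_j share a class label (diagonal excluded).
    W = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    Lap = np.diag(W.sum(axis=1)) - W     # graph Laplacian of W
    Sw = X @ Lap @ X.T                   # sums squared within-class distances

    # Between-class scatter, as in LDA.
    Sb = np.zeros((m, m))
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        diff = Xc.mean(axis=1, keepdims=True) - mean
        Sb += Xc.shape[1] * (diff @ diff.T)

    # Generalized eigenproblem Sb p = lambda Sw p; the small ridge keeps
    # Sw positive definite (an assumption for numerical stability).
    vals, vecs = eigh(Sb, Sw + 1e-6 * np.eye(m))
    return vecs[:, np.argsort(vals)[::-1][:d]]   # top-d directions
```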

The rest of this paper is structured as follows. In Section 2, the regularized least square (RLS) and RLSDP are briefly reviewed. The proposed ERLSDP is detailed in Section 3. The experimental results are illustrated in Section 4, and the conclusions are given in Section 5.

RLS and RLSDP

Let $X=[x_1,x_2,\ldots,x_n]\in\mathbb{R}^{m\times n}$ be a set of n training samples from C classes, where $x_i\in\mathbb{R}^m$ is the ith sample. Based on the class labels, X can also be partitioned as $X=[X_1,X_2,\ldots,X_C]$, where $X_c=[x_1^c,x_2^c,\ldots,x_{n_c}^c]\in\mathbb{R}^{m\times n_c}$ contains the samples of class c, $x_j^c$ denotes the jth sample in the cth class, and $n_c$ is the number of samples in class c.
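With this notation, a minimal sketch of the regularized least square coding that RLSDP builds on might look as follows. The leave-one-out dictionary (each sample is coded over the remaining n−1 samples) and the default λ are assumptions consistent with the description above, not the authors' code.

```python
import numpy as np

def rls_coefficients(X, lam=0.01):
    """Regularized least square (RLS) coding of each training sample:
        min_s ||x_i - X_{-i} s||_2^2 + lam * ||s||_2^2,
    solved in closed form for every column x_i of X.

    X : (m, n) columns are training samples; lam is an assumed default.
    Returns an (n, n) matrix S with S[i, i] = 0, where column i holds
    the coefficients that reconstruct x_i from the other samples."""
    m, n = X.shape
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)   # exclude x_i from its own dictionary
        Xi = X[:, idx]
        s = np.linalg.solve(Xi.T @ Xi + lam * np.eye(n - 1), Xi.T @ X[:, i])
        S[idx, i] = s
    return S
```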

Motivations

It is seen from Eq. (2) that, to minimize the within-class compactness, RLSDP only minimizes the reconstruction error between each sample $x_i$ and its reconstruction from the coefficients $s_i^+$, whose elements are defined in Eq. (4). There are two main problems. First, using $s_i^+$ to reconstruct $x_i$ incurs a larger error, since the non-zero entries of $s_i^+$ are associated only with the same class as $x_i$. Second, RLSDP neglects the within-class geometry, which is very important for characterizing the within-class compactness.
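The first problem can be checked directly: zero out every coefficient that does not belong to $x_i$'s class (an $s_i^+$-style vector) and compare the two reconstruction errors. The hypothetical helper below reuses the coefficient matrix S from the RLS sketch in the previous section; all names are assumptions.

```python
import numpy as np

def reconstruction_errors(X, labels, S, i):
    """Compares reconstructing x_i from class-restricted coefficients
    (RLSDP-style s_i^+) versus from all of its coefficients (ERLSDP-style).

    S : (n, n) coefficient matrix with S[i, i] = 0, e.g. from
    rls_coefficients above; labels : (n,) integer array."""
    s = S[:, i]
    # s_i^+: keep only the coefficients of x_i's own class.
    s_plus = np.where(labels == labels[i], s, 0.0)

    err_class_only = np.linalg.norm(X[:, i] - X @ s_plus)   # RLSDP-style
    err_all = np.linalg.norm(X[:, i] - X @ s)               # ERLSDP-style
    # err_all is typically the smaller of the two, which is the
    # motivation for using all coefficients in ERLSDP.
    return err_class_only, err_all
```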

Experimental results

To show the effectiveness of ERLSDP, we compare it with CRP [21], LDA [2], MFA [7], DSNPE [18], DLPP [6], CRRP [23] and RLSDP [24] on three face databases, namely ORL [25], AR [26] and FERET [27]. For MFA, we empirically set the neighbor parameter $k_1$ to $n_i-1$ and select $k_2$ from $\{C, 3C, 5C, 7C, 9C\}$, where $n_i$ and C are the number of training samples in class i and the number of classes, respectively. The publicly available solver ℓ1-magic (http://users.ece.gatech.edu/~justin/l1magic/) is used for solving the ℓ1 norm minimization problems.

Conclusions and future work

In this paper, we propose ERLSDP for feature extraction. Compared with the original RLSDP, ERLSDP utilizes all the representation coefficients of each sample, so it achieves better reconstruction accuracy. In addition, ERLSDP explicitly minimizes the distances between all the within-class samples at the same time. It is therefore able to make the within-class samples more compact, which is desirable for classification. Experimental results on three face databases validate its effectiveness.

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant 61271293.

References (38)

  • W. Yang et al., A regularized least square based discriminative projections for feature extraction, Neurocomputing (2016)

  • P.J. Phillips et al., The FERET database and evaluation procedure for face-recognition algorithms, Image Vis. Comput. (1998)

  • G.-F. Lu et al., L1-norm and maximum margin criterion based discriminant locality preserving projections via trace Lasso, Pattern Recognit. (2016)

  • C.-X. Ren et al., Robust classification using ℓ2,1-norm based regression model, Pattern Recognit. (2012)

  • M. Turk et al., Eigenfaces for recognition, J. Cogn. Neurosci. (1991)

  • P.N. Belhumeur et al., Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)

  • X. He et al., Face recognition using Laplacianfaces, IEEE Trans. Pattern Anal. Mach. Intell. (2005)

  • T. Zhang et al., Patch alignment for dimensionality reduction, IEEE Trans. Knowl. Data Eng. (2009)

  • S. Yan et al., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. (2007)