Pattern Recognition

Volume 37, Issue 5, May 2004, Pages 1077-1079

Rapid and Brief Communication
An efficient algorithm to solve the small sample size problem for LDA

https://doi.org/10.1016/j.patcog.2003.02.001

Abstract

In this paper, we present an efficient algorithm to compute the most discriminant vectors of LDA for high-dimensional data sets. Experiments on the ORL face database confirm the effectiveness of the proposed method.

Introduction

Linear discriminant analysis (LDA) is a very important feature extraction method in pattern recognition. However, one of the difficulties encountered in the use of LDA is the so-called “small sample size” problem [1]. In 2000, Chen et al. [1] proposed a new algorithm to calculate the most discriminant vectors of LDA. They proved that the null space of the within-class scatter matrix SW contains the most discriminative information, and they therefore solve for the optimal discriminant vectors in the null space of SW. In 2001, Yu et al. [2] proposed another method for the same task. Unlike the method in [1], they solve for the discriminant vectors in the complement of the null space of SB, on the assumption that the effective discriminant vectors lie in this subspace. However, our further studies have shown that this is not generally true, because many of the most discriminant vectors lie in neither the null space of SB nor its complementary space. Thus, one may not find the most discriminant vectors in the complement of the null space of SB.
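
The paper's own resolution procedure is in the full text; as a rough illustration of the null-space idea of Chen et al. [1] described above, the sketch below finds a basis of the null space of SW and then keeps the directions within it that maximize the between-class scatter. This is a minimal sketch, not the authors' algorithm: the function name and tolerance are our choices, and forming the scatter matrices explicitly is exactly the cost that becomes impractical in high dimensions.

```python
import numpy as np

def null_space_lda(X, y, tol=1e-10):
    """Illustrative null-space LDA in the spirit of Chen et al. [1].

    X: (n_samples, n_features) data, y: class labels. Forms S_W and S_B
    explicitly, so this only works for modest n_features; avoiding that
    cost is the point of an efficient algorithm such as the one proposed
    in the paper.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Basis of null(S_W): eigenvectors with numerically zero eigenvalues.
    evals, evecs = np.linalg.eigh(S_w)
    Q = evecs[:, evals < tol]
    # Inside null(S_W), keep the directions maximizing between-class scatter.
    evals_b, evecs_b = np.linalg.eigh(Q.T @ S_b @ Q)
    order = np.argsort(evals_b)[::-1]
    return Q @ evecs_b[:, order[:len(classes) - 1]]
```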

Although Chen et al. give an exact resolution procedure for LDA, it still faces computational difficulties when the dimensionality of the samples is very large. Shu et al. [5] proposed another algorithm under the statistically uncorrelated condition [4]. However, this method suffers from instability, according to degenerate perturbation theory. Moreover, by the physical meaning of LDA, the solutions obtained with Shu et al.'s algorithm are not guaranteed to be the most discriminant vectors. In this paper, we present a robust and efficient algorithm to overcome the above problems.

Section snippets

Efficient algorithm for LDA in high-dimensional space

First of all, we introduce two theorems related to our work. Theorems 1 and 2 can be seen as extensions of the fourth theorem in [3] and the third theorem in [4], respectively. The detailed proofs of the two theorems are omitted in this paper. In what follows, let ST denote the total scatter matrix of the training samples, let ST(0) be the null space of ST, and let ST(0)⊥ be the orthogonal complement of ST(0).
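
The paper's own procedure is omitted here, but the definitions above admit a well-known shortcut worth recording: since ST = XcᵀXc for the centered data matrix Xc (samples as rows), a basis of ST(0)⊥, the range of ST, can be read off an economy-size SVD of Xc without ever forming the d×d matrix ST. The sketch below is a minimal illustration under that standard identity; the function name and tolerance are our choices, not the paper's.

```python
import numpy as np

def total_scatter_range(X, tol=1e-10):
    """Basis of S_T(0)-perp, the orthogonal complement of null(S_T).

    With samples as rows and Xc the centered data, S_T = Xc.T @ Xc, so
    range(S_T) is spanned by the right singular vectors of Xc with
    nonzero singular values. The d x d matrix S_T is never formed,
    which keeps the cost manageable when n_features >> n_samples.
    """
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[s > tol].T   # columns span S_T(0)-perp
```

Projecting the training samples onto these columns reduces the working dimensionality from d to at most n−1 before any further discriminant analysis.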

Theorem 1

Suppose that ST is singular. Then the

Experiments

We use the Olivetti Research Lab (ORL) face database to test the proposed algorithm. The database contains 10 different images of each of 40 distinct persons. The original face images are all of size 112×92 with 256 gray levels. Before conducting the experiment, the gray scale was linearly normalized to lie within [−1,1]. The training and query sets are partitioned as follows: we randomly choose five images per person for training and the remaining five for testing.
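
For concreteness, the normalization and split just described can be sketched as follows. This assumes the 400 ORL images have already been loaded into an array of 8-bit gray values in [0, 255] (the PGM loading code is not shown); the function name and random seed are our choices.

```python
import numpy as np

def orl_split(images, labels, seed=0):
    """Sketch of the protocol above: normalize to [-1, 1], split 5/5.

    images: (400, 112*92) array of gray values assumed in [0, 255];
    labels: (400,) person identities (40 persons, 10 images each).
    """
    rng = np.random.default_rng(seed)
    X = 2.0 * images.astype(np.float64) / 255.0 - 1.0  # linear map to [-1, 1]
    train_idx, test_idx = [], []
    for person in np.unique(labels):
        idx = rng.permutation(np.where(labels == person)[0])
        train_idx.extend(idx[:5])   # five images per person for training
        test_idx.extend(idx[5:])    # remaining five for testing
    return X[train_idx], labels[train_idx], X[test_idx], labels[test_idx]
```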

We extract the


Cited by (56)

  • A survey on Laplacian eigenmaps based manifold learning methods

    2019, Neurocomputing
    Citation Excerpt :

The problem occurs in LDA when the number of samples is smaller than the data dimensionality, which results in a singular (non-invertible) within-class scatter matrix. In order to overcome the problem, some techniques have been proposed [117–121]. Among them, a straightforward method is to change the ratio form of the original LDA into a difference form, which is defined as the margin between the various classes.

  • MBLDA: A novel multiple between-class linear discriminant analysis

    2016, Information Sciences
    Citation Excerpt :

Finally, some conclusions are given. In some special cases, such as the undersampled or small sample size problem [7,14,54], Sw may become singular and its inverse is not well defined [9,22]. In order to overcome this problem, several effective methods have been proposed, such as PCA+LDA [3] and RLDA [52].

  • Complete large margin linear discriminant analysis using mathematical programming approach

    2013, Pattern Recognition
    Citation Excerpt :

Zheng et al. [11] presented a similar approach which further incorporates a statistically uncorrelated constraint on the discriminant vectors into NLDA, such that the extracted features are uncorrelated [12]. However, these null space-based methods [9–12] only consider discriminative information from the null space of Sw, discarding that contained in its orthogonal complement. To make full use of the information residing in both subspaces, Yang et al. [13,14] proposed a complete LDA (CLDA) framework which extracts the irregular discriminant vectors from the null space of Sw and the regular discriminant vectors from its orthogonal complement, respectively.

  • An efficient 3D face recognition approach based on the fusion of novel local low-level features

    2013, Pattern Recognition
    Citation Excerpt :

    In order to solve the non-linearly separable problem, the “kernel trick” is introduced to map the data from the original feature space into a high dimensional space, in which the mapped data can be expected to become more discriminative and separable by a hyper-plane. In order to demonstrate the superiority of the non-linear SVM, a comparison between a modified LDA algorithm [39,40] (the modified LDA algorithm resolves the singularity problem caused by small sample-sized training data [41,42]) and three SVM-based algorithms (Linear-SVM, Polynomial-SVM and RBF-SVM) is presented in Table 2. All experiments in this section are performed according to the same experimental setups of Section 4.3.

  • A novel maximum margin neighborhood preserving embedding for face recognition

    2012, Future Generation Computer Systems
    Citation Excerpt :

    LDA takes consideration of the labels of the input data and improves the recognition ability. However, LDA suffers from the small sample size (SSS) problem and many effective methods [4–6] have been explored to solve the problem. As the world is not always flat, linear DR cannot adequately explore the nonlinear structure of real data [7].
