Rapid and brief communication

Uncorrelated heteroscedastic LDA based on the weighted pairwise Chernoff criterion
Introduction
In statistical pattern recognition, linear dimensionality reduction (LDR) techniques are widely applied to reduce the complexity of the statistical model and often improve classification accuracy in the transformed lower-dimensional space. Fisher's linear discriminant analysis (LDA) is one of the most popular supervised linear dimensionality reduction techniques; it seeks an optimal set of discriminant vectors by maximizing the Fisher criterion $J(w) = (w^T S_b w)/(w^T S_w w)$. Here, $S_b$ and $S_w$ are the between-class scatter matrix and the average within-class scatter matrix of the training sample group, respectively, which can be estimated as follows: $S_b = \sum_{i=1}^{c} p_i (m_i - m)(m_i - m)^T$ and $S_w = \sum_{i=1}^{c} p_i \Sigma_i$, where $c$, $p_i$, $m_i$, $m$ and $\Sigma_i$ represent the total number of pattern classes, the a priori probability of pattern class $i$, the mean vector of class $i$, the mean vector of all training samples and the covariance matrix of class $i$, respectively. The between-class scatter matrix can be expressed both by the original definition and by its equivalent pairwise decomposition form [1]: $S_b = \sum_{i=1}^{c-1} \sum_{j=i+1}^{c} p_i p_j (m_i - m_j)(m_i - m_j)^T$.
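The scatter matrices above can be sketched numerically. The following snippet (our illustration on synthetic data; the variable names follow the symbols in the text) estimates $S_w$ and $S_b$ from three toy classes and verifies that the original definition of $S_b$ agrees with its pairwise decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-class data in 4 dimensions (illustrative only).
X = [rng.normal(loc=i, size=(20 + 5 * i, 4)) for i in range(3)]
n = sum(len(Xi) for Xi in X)
p = np.array([len(Xi) / n for Xi in X])          # a priori probabilities p_i
means = np.array([Xi.mean(axis=0) for Xi in X])  # class means m_i
m = (p[:, None] * means).sum(axis=0)             # global mean m
covs = [np.cov(Xi, rowvar=False, bias=True) for Xi in X]  # covariances Sigma_i

# Average within-class scatter S_w and between-class scatter S_b.
S_w = sum(p_i * C for p_i, C in zip(p, covs))
S_b = sum(p_i * np.outer(mi - m, mi - m) for p_i, mi in zip(p, means))

# Equivalent pairwise decomposition of S_b.
S_b_pair = sum(p[i] * p[j] * np.outer(means[i] - means[j], means[i] - means[j])
               for i in range(3) for j in range(i + 1, 3))

print(np.allclose(S_b, S_b_pair))  # True: the two forms coincide
```

The equivalence holds for any class priors and means, which is what makes the pairwise weighting discussed later possible.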
Uncorrelated features are usually desirable in pattern recognition tasks because an uncorrelated feature set is likely to contain more discriminatory information than a correlated one of the same dimension. Recently, Jin et al. [2] proposed the uncorrelated LDA (ULDA) technique, which obtains discriminant vectors by maximizing the Fisher criterion under the constraint that the extracted feature components are statistically uncorrelated, i.e. the derived discriminant vectors are subject to the $S_t$-orthogonality constraints $w_i^T S_t w_j = 0$ $(i \neq j)$, where $S_t = S_b + S_w$ is the total scatter matrix. Yang et al. [4] also demonstrated that ideal discriminant vectors should not only correspond to maximal Fisher criterion values but also to minimal correlations between the extracted feature components. Therefore, ULDA can yield a set of discriminant vectors with better discriminating power, as shown experimentally in Refs. [2], [4].
However, the ULDA technique still suffers from some deficiencies. First, it is incapable of dealing with heteroscedastic data in a proper way, owing to the implicit assumption that the covariance matrices of all classes are equal. Hence the discriminant vectors derived by ULDA can merely attempt to separate the class means as much as possible, while ignoring the discriminatory information present in the differences between the per-class covariance matrices. This also limits the number of discriminant vectors extracted by ULDA to at most $c - 1$, as proven in Ref. [2]. Second, from the equivalent pairwise decomposition of $S_b$ we can easily see that class pairs separated by a large distance in the original feature space are overemphasized in the pairwise sum; the resulting transformation attempts to preserve the distances of already well-separated classes while causing larger overlap between pairs of classes that are not well separated in the original feature space. Consequently, discriminant directions that would separate neighboring classes well in the original feature space cannot be obtained by ULDA if some classes lie far away from, and are already well separated from, the other classes. In this paper, we propose an uncorrelated heteroscedastic LDA (UHLDA) technique based on the weighted pairwise Chernoff criterion, which successfully solves both of the above problems.
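The pairwise separability measure underlying the Chernoff criterion can be illustrated as follows. This sketch (our code, not the paper's exact weighting formula) computes the Chernoff distance between two Gaussian class models; unlike the Euclidean distance between means, it also reflects covariance differences, and a decreasing weight function of this distance would down-weight already well-separated pairs:

```python
import numpy as np

def chernoff_distance(m1, S1, m2, S2, alpha=0.5):
    """Chernoff distance between Gaussians N(m1, S1) and N(m2, S2).

    For alpha = 0.5 this reduces to the Bhattacharyya distance.  Shown
    here only as the pairwise separability measure that a weighting
    function can decay with (an illustrative sketch).
    """
    S = alpha * S1 + (1 - alpha) * S2
    d = m1 - m2
    term1 = 0.5 * alpha * (1 - alpha) * d @ np.linalg.solve(S, d)
    term2 = 0.5 * (np.linalg.slogdet(S)[1]
                   - alpha * np.linalg.slogdet(S1)[1]
                   - (1 - alpha) * np.linalg.slogdet(S2)[1])
    return term1 + term2

# A well-separated class pair yields a larger Chernoff distance, so a
# decreasing weight (e.g. proportional to 1/distance) de-emphasizes it.
m1, m2, m3 = np.zeros(2), np.array([1.0, 0.0]), np.array([10.0, 0.0])
S = np.eye(2)
d_near = chernoff_distance(m1, S, m2, S)
d_far = chernoff_distance(m1, S, m3, S)
print(d_near < d_far)  # True
```

With equal covariances the log-determinant term vanishes and the distance reduces to a scaled Mahalanobis distance between the means, which is consistent with the homoscedastic LDA setting.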
ULDA technique
Suppose that $S_b$ and $S_t$ are positive semi-definite matrices and $S_w$ is a positive definite matrix. The first ULDA discriminant vector, denoted by $w_1$, is calculated as the eigenvector corresponding to the maximal eigenvalue of the eigenequation $S_w^{-1} S_b w = \lambda w$. Suppose that the eigenvectors $w_1, \dots, w_r$ have been obtained. The $(r+1)$th ULDA discriminant vector $w_{r+1}$, which maximizes the Fisher criterion function under the $S_t$-orthogonality constraints $w_{r+1}^T S_t w_i = 0$ $(i = 1, \dots, r)$, is the eigenvector corresponding to the maximum eigenvalue of the eigenequation $P S_w^{-1} S_b w = \lambda w$, where $P = I - S_w^{-1} S_t D^T (D S_t S_w^{-1} S_t D^T)^{-1} D S_t$ and $D = [w_1, \dots, w_r]^T$.
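The recursive extraction can be sketched in a few lines. The projection matrix below follows from our own Lagrangian derivation of the constrained maximization (a sketch under that assumption, not code from the paper); the final check confirms that the extracted vectors are mutually $S_t$-orthogonal, i.e. the feature components are uncorrelated:

```python
import numpy as np

def ulda(S_b, S_w, S_t, r):
    """Extract r ULDA discriminant vectors by recursive eigenequations.

    Each step maximizes the Fisher criterion subject to S_t-orthogonality
    with all previously extracted vectors (our derivation of P).
    """
    n = S_b.shape[0]
    Swi = np.linalg.inv(S_w)
    W = []
    for _ in range(r):
        if not W:
            M = Swi @ S_b
        else:
            D = np.array(W)                      # rows are w_1 .. w_r
            G = D @ S_t @ Swi @ S_t @ D.T
            P = np.eye(n) - Swi @ S_t @ D.T @ np.linalg.solve(G, D @ S_t)
            M = P @ Swi @ S_b
        vals, vecs = np.linalg.eig(M)
        w = np.real(vecs[:, np.argmax(np.real(vals))])
        W.append(w / np.linalg.norm(w))
    return np.array(W)

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5))
S_w = A @ A.T + 5 * np.eye(5)                    # positive definite
B = rng.normal(size=(5, 3))
S_b = B @ B.T                                    # positive semi-definite
S_t = S_b + S_w
W = ulda(S_b, S_w, S_t, 3)

# The extracted features are uncorrelated: W S_t W^T is diagonal.
C = W @ S_t @ W.T
print(np.allclose(C - np.diag(np.diag(C)), 0, atol=1e-8))  # True
```

Note that any eigenvector of $P S_w^{-1} S_b$ with a nonzero eigenvalue automatically satisfies the constraints, since $D S_t P = 0$ by construction.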
Experimental results
We test the performance of the proposed UHLDA technique on four data sets from the UCI repository: Ionosphere, Sonar, Ecoli and Pendigits. Compared with the ULDA solution, UHLDA demonstrates its superiority by extracting more discriminatory features, thereby improving the final classification results.
We employ the Bayesian linear discriminant classifier (LDC) for classification in the reduced feature space.
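For concreteness, a minimal sketch of such an LDC (our implementation, assuming Gaussian classes with a shared pooled covariance; not necessarily the exact classifier configuration used in the experiments):

```python
import numpy as np

def fit_ldc(X_list):
    """Bayesian linear discriminant classifier with pooled covariance.

    Discriminant functions: g_i(x) = m_i^T P x - 0.5 m_i^T P m_i + ln p_i,
    where P is the inverse of the pooled covariance matrix.
    """
    n = sum(len(X) for X in X_list)
    priors = np.array([len(X) / n for X in X_list])
    means = np.array([X.mean(axis=0) for X in X_list])
    pooled = sum(len(X) * np.cov(X, rowvar=False, bias=True)
                 for X in X_list) / n
    Pi = np.linalg.inv(pooled)
    W = means @ Pi                                  # linear weights
    b = -0.5 * np.einsum('ij,ij->i', W, means) + np.log(priors)
    return W, b

def predict_ldc(W, b, X):
    return np.argmax(X @ W.T + b, axis=1)

# Sanity check on a well-separated two-class toy problem.
rng = np.random.default_rng(2)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
X1 = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(50, 2))
W, b = fit_ldc([X0, X1])
acc = (predict_ldc(W, b, np.vstack([X0, X1])) ==
       np.array([0] * 50 + [1] * 50)).mean()
print(acc)
```

Because the LDC itself assumes equal class covariances, any gain it shows on UHLDA-transformed features is attributable to the feature extraction stage rather than the classifier.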
References (5)
- et al., Face recognition based on the uncorrelated discriminant transformation, Pattern Recognition (2001).
- et al., A theorem on the uncorrelated optimal discriminant vectors, Pattern Recognition (2001).