ABSTRACT
Feature extraction is important in many applications, such as text and image retrieval, where the data are high-dimensional. Uncorrelated Linear Discriminant Analysis (ULDA) was recently proposed for feature extraction. The features extracted via ULDA were shown to be statistically uncorrelated, which is desirable for many applications. In this paper, we first propose the ULDA/QR algorithm to simplify the previous implementation of ULDA. We then propose the ULDA/GSVD algorithm, based on a novel optimization criterion, to address the singularity problem. It is applicable to undersampled problems, where the data dimension is much larger than the number of data points, as in text and image retrieval. The novel criterion used in ULDA/GSVD is a perturbed version of the one from ULDA/QR. Surprisingly, however, the solution to ULDA/GSVD is shown to be independent of the amount of perturbation applied. We conducted extensive experiments on text and face image data to show the effectiveness of ULDA/GSVD and to compare it with other popular feature extraction algorithms.
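To make the "statistically uncorrelated" property concrete, the following is a minimal sketch of a classical uncorrelated discriminant transformation in the easy case where the total scatter matrix is nonsingular. It is not the ULDA/QR or ULDA/GSVD algorithm of this paper. The function name `uncorrelated_lda`, the synthetic data, and the Cholesky-based whitening are all assumptions chosen for illustration.

```python
import numpy as np

def uncorrelated_lda(X, y, k):
    """Return a d x k matrix G with G.T @ S_t @ G = I.

    Illustrative only: assumes the total scatter S_t is nonsingular,
    which is exactly what fails in the undersampled setting.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    S_t = Xc.T @ Xc                        # total scatter matrix
    S_b = np.zeros_like(S_t)               # between-class scatter matrix
    for c in np.unique(y):
        Xk = X[y == c]
        diff = (Xk.mean(axis=0) - mean)[:, None]
        S_b += Xk.shape[0] * (diff @ diff.T)
    L = np.linalg.cholesky(S_t)            # S_t = L @ L.T
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T              # S_b expressed in whitened coordinates
    M = (M + M.T) / 2                      # symmetrize against round-off
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :k]               # top-k orthonormal eigenvectors
    return L_inv.T @ W                     # map back to the original coordinates

# Synthetic data (hypothetical): 3 classes, 40 points each, 6 dimensions.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 40)
X = rng.normal(size=(120, 6)) + 2.0 * rng.normal(size=(3, 6))[y]

G = uncorrelated_lda(X, y, k=2)
Y = (X - X.mean(axis=0)) @ G               # extracted features
corr = np.corrcoef(Y, rowvar=False)        # off-diagonal entries should be ~0
```

Because the columns of `G` are orthonormal with respect to the total scatter matrix, the sample covariance of the extracted features is diagonal. When the data dimension exceeds the number of samples, the Cholesky step above fails, which is the singularity problem the abstract refers to.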