Abstract
The Fisher criterion has achieved great success in dimensionality reduction. Two representative methods based on it are Fisher score and Linear Discriminant Analysis (LDA): the former is developed for feature selection, while the latter is designed for subspace learning. Over the past decade, these two approaches have mostly been studied independently. In this paper, based on the observation that Fisher score and LDA are complementary, we propose to integrate them in a unified framework, named Linear Discriminant Dimensionality Reduction (LDDR). We aim to find a subset of features on which the linear transformation learnt via LDA maximizes the Fisher criterion. LDDR inherits the advantages of Fisher score and LDA and is able to perform feature selection and subspace learning simultaneously. Both Fisher score and LDA can be seen as special cases of the proposed method. The resulting optimization problem is a mixed integer program, which is difficult to solve. We relax it into an ℓ2,1-norm constrained least squares problem and solve it with an accelerated proximal gradient descent algorithm. Experiments on benchmark face recognition data sets show that the proposed method outperforms state-of-the-art methods.
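To make the pieces of the abstract concrete: the Fisher criterion maximized by LDA is commonly written as J(W) = tr((W^T S_w W)^{-1} W^T S_b W), where S_b and S_w are the between-class and within-class scatter matrices. The sketch below is not the authors' implementation of LDDR; it only illustrates, under assumptions, the kind of FISTA-style accelerated proximal gradient solver the abstract refers to, applied to a generic ℓ2,1-regularized least squares problem min_W 0.5·‖XW − Y‖_F² + λ‖W‖_{2,1}. The function names and the exact objective are hypothetical.

```python
import numpy as np

def prox_l21(W, t):
    # Row-wise proximal operator of t * ||W||_{2,1}:
    # each row is shrunk toward zero by t in its l2 norm,
    # which zeroes out entire rows (i.e., deselects features).
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12)) * W

def apg_l21_least_squares(X, Y, lam, n_iter=500):
    # FISTA-style accelerated proximal gradient for
    #   min_W  0.5 * ||X W - Y||_F^2 + lam * ||W||_{2,1}
    # X: (n, d) data matrix; Y: (n, c) encoded class labels (assumed setup).
    d, c = X.shape[1], Y.shape[1]
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the smooth gradient
    W = np.zeros((d, c))
    V, t_k = W.copy(), 1.0                 # momentum point and momentum counter
    for _ in range(n_iter):
        G = X.T @ (X @ V - Y)              # gradient of the least-squares term at V
        W_next = prox_l21(V - G / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t_k ** 2)) / 2.0
        V = W_next + ((t_k - 1.0) / t_next) * (W_next - W)
        W, t_k = W_next, t_next
    return W
```

Rows of the returned W with nonzero ℓ2 norm correspond to selected features, while W itself acts as a linear transformation; this row-sparsity is how an ℓ2,1 penalty couples feature selection with subspace learning in one optimization.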
Keywords
- Feature Selection
- Linear Discriminant Analysis
- Face Image
- Feature Selection Method
- Locality Preserving Projection
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gu, Q., Li, Z., Han, J. (2011). Linear Discriminant Dimensionality Reduction. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23780-5_45
DOI: https://doi.org/10.1007/978-3-642-23780-5_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23779-9
Online ISBN: 978-3-642-23780-5
eBook Packages: Computer Science, Computer Science (R0)