Abstract
Linear discriminant analysis (LDA) is a popular technique for both dimensionality reduction and classification. However, LDA suffers from the small sample size problem when dealing with high-dimensional data. Several approaches have been proposed to overcome this issue, but the resulting transformation matrices fail to extract structures shared among data samples. In this paper, we propose trace norm regularized LDA, which not only tackles the small sample size problem but also uncovers the underlying structures between target classes. Specifically, our formulation characterizes the intrinsic dimensionality of the transformation matrix, owing to the appealing property of the trace norm. Evaluations on nine real data sets demonstrate the effectiveness of our algorithm.
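The abstract does not give the paper's formulation or solver, so the following is only a minimal sketch of the trace norm property it appeals to, not the authors' algorithm. It assumes NumPy; the helper names `trace_norm` and `svt` and the toy matrix `W` are hypothetical. The trace norm is the sum of a matrix's singular values, and its proximal operator (singular value soft-thresholding) sets small singular values to zero, which is why penalizing it tends to yield a low-rank transformation matrix.

```python
import numpy as np

def trace_norm(W):
    """Trace (nuclear) norm: the sum of the singular values of W."""
    return np.linalg.svd(W, compute_uv=False).sum()

def svt(W, tau):
    """Singular value soft-thresholding, the proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink each singular value, clip at zero
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunk spectrum

# Hypothetical toy "transformation matrix" with singular values 5, 2 and 0.5.
W = np.diag([5.0, 2.0, 0.5])
W_low = svt(W, tau=1.0)

print(trace_norm(W))                 # sum of singular values before shrinkage
print(np.linalg.matrix_rank(W_low))  # rank after shrinkage
```

In this toy case the smallest singular value falls below the threshold and is removed, so the shrunk matrix has lower rank than the original. A regularized LDA solver would typically apply such a step repeatedly inside an iterative scheme, which is an assumption about common practice rather than a description of this paper.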
Acknowledgements
The authors gratefully thank the anonymous referees for their critical comments. This work was supported in part by the 863 Program of China under Grant 2008AA02Z310 and by the NSFC under Grant 60873133, together with the Shanghai Committee of Science and Technology under Grants 08411951200 and 08JG05002.
About this article
Cite this article
Shu, X., Lu, H. Linear discriminant analysis with spectral regularization. Appl Intell 40, 724–731 (2014). https://doi.org/10.1007/s10489-013-0485-x
DOI: https://doi.org/10.1007/s10489-013-0485-x