Abstract
Many algorithms for discriminant analysis (DA) have been developed over the last few years. Despite their different motivations, they all inject structure information from the data into their own within- and between-class scatters. However, to the best of our knowledge, there has not yet been a systematic examination of (1) which structure granularities lurk in data; (2) which structure granularities are utilized in the scatters of a given DA algorithm; and (3) whether new DA algorithms can be developed from existing structure granularities. This paper answers these questions by establishing a structurally motivated (SM) framework for DA, together with a unified mathematical formulation based on the ratio trace. The framework categorizes existing DA algorithms by the granularity of the data structure used to construct their scatters, identifies the scenarios in which they are applicable for different structure types, and provides insights into developing new DA algorithms. Guided by these insights, we find that the cluster granularity, which lies in the middle of the granularity spectrum of the SM framework, can be exploited further. We therefore derive three cluster-granularity DA algorithms from the SM framework by injecting cluster structure information into the within-class scatter matrix, the between-class scatter matrix, and both scatter matrices jointly of the classical MDA; the resulting algorithms are called SWDA, SBDA and SWBDA, respectively. Injecting cluster structure information lets the three proposed algorithms fit relatively complicated data more effectively and, combined with a regularization technique, obtain more projections than the classical MDA, which helps make DA more effective. Moreover, MDA is a special case of all three algorithms when the number of clusters in every class is set to 1. Our experiments on benchmark data (face and UCI databases) show that the proposed algorithms yield encouraging results.
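The abstract does not state the scatter definitions, so the following is only a minimal sketch under assumed, textbook-style definitions rather than the paper's own formulation. It builds within- and between-group scatters at either class or cluster granularity and solves a regularized ratio-trace problem through a generalized eigenproblem. The function names (`within_scatter`, `between_scatter`, `ratio_trace_projection`), the parameter `reg`, and the cluster labels are all hypothetical; the cluster labels are assumed to come from some clustering step run inside each class, which is not shown.

```python
import numpy as np

def within_scatter(X, groups):
    """Sum of outer products of samples centred on their own group mean."""
    groups = np.asarray(groups)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for g in np.unique(groups):
        Xg = X[groups == g]
        centred = Xg - Xg.mean(axis=0)
        Sw += centred.T @ centred
    return Sw

def between_scatter(X, groups):
    """Size-weighted outer products of group means centred on the global mean."""
    groups = np.asarray(groups)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    for g in np.unique(groups):
        Xg = X[groups == g]
        diff = Xg.mean(axis=0) - mu
        Sb += len(Xg) * np.outer(diff, diff)
    return Sb

def ratio_trace_projection(Sw, Sb, n_components, reg=1e-3):
    """Solve Sb w = lambda (Sw + reg*I) w and keep the leading eigenvectors.

    The regularizer `reg` is an assumption of this sketch; it keeps the
    within scatter positive definite so the Cholesky factor exists.
    """
    d = Sw.shape[0]
    L = np.linalg.cholesky(Sw + reg * np.eye(d))   # Sw + reg*I = L @ L.T
    L_inv = np.linalg.inv(L)
    C = L_inv @ Sb @ L_inv.T                       # symmetric whitened problem
    vals, U = np.linalg.eigh(C)                    # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    return L_inv.T @ U[:, order], vals[order]

# Toy data: two classes, each made of two separated clusters of 20 points in 3-D.
rng = np.random.default_rng(0)
centres = np.array([[0, 0, 0], [4, 4, 0], [0, 4, 4], [4, 0, 4]], dtype=float)
X = np.vstack([c + rng.normal(size=(20, 3)) for c in centres])
clusters = np.repeat(np.arange(4), 20)   # hypothetical cluster ids
classes = clusters // 2                  # clusters 0,1 -> class 0; clusters 2,3 -> class 1

Sb_class = between_scatter(X, classes)
# Class-granularity within scatter (classical setting).
W_class, vals_class = ratio_trace_projection(within_scatter(X, classes), Sb_class, 1)
# Cluster-granularity within scatter combined with the class-level between scatter.
W_cluster, vals_cluster = ratio_trace_projection(within_scatter(X, clusters), Sb_class, 1)
```

In this sketch, passing `classes` in place of `clusters` (one cluster per class) makes the cluster-level within scatter identical to the class-level one, which mirrors the abstract's remark that MDA is recovered when every class has a single cluster.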
Notes
SWDA, SBDA and SWBDA are the three algorithms subsequently proposed in this paper.
Acknowledgments
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions for improving the quality of this paper. Thanks also go to Deng Cai and Xiaofei He for making the LSDA code available on their homepages. The authors also thank NSFC for support under Grant Nos. 60973097 and 60105003.
Cite this article
Yang, B., Chen, S. & Wu, X. A structurally motivated framework for discriminant analysis. Pattern Anal Applic 14, 349–367 (2011). https://doi.org/10.1007/s10044-011-0228-8
DOI: https://doi.org/10.1007/s10044-011-0228-8