Abstract
Kernel discriminant analysis (KDA) is a state-of-the-art kernel-based method for pattern classification and dimensionality reduction. It performs linear discriminant analysis in the feature space induced by a kernel function. However, the performance of KDA depends heavily on selecting a suitable kernel for the learning task at hand. In this paper, we propose a novel algorithm, termed elastic multiple kernel discriminant analysis (EMKDA), which uses hybrid regularization to learn the kernel automatically as a linear combination of pre-specified kernel functions. EMKDA uses a mixed-norm regularization function to trade off sparsity against non-sparsity of the kernel weights. A semi-infinite-program-based algorithm is then proposed to solve EMKDA. Extensive experiments on synthetic datasets, UCI benchmark datasets, and digit and terrain databases are conducted to show the effectiveness of the proposed method.
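To make the two ingredients named in the abstract concrete, the following is a minimal sketch of (a) a kernel formed as a nonnegative linear combination of pre-specified base kernels and (b) a penalty that mixes an L1 term (favoring sparse weights) with a squared L2 term (favoring non-sparse weights). The abstract does not state the exact form of the EMKDA regularizer, so the elastic-net-style penalty, the choice of Gaussian base kernels, and every name here (`rbf_kernel`, `combine_kernels`, `elastic_penalty`, `alpha`) are illustrative assumptions, not the authors' formulation.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel value between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma):
    """Gram matrix of one base kernel over a list of points."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


def combine_kernels(gram_matrices, weights):
    """Combined kernel K = sum_m weights[m] * K_m, with weights[m] >= 0."""
    if len(gram_matrices) != len(weights):
        raise ValueError("need exactly one weight per base kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be nonnegative")
    n = len(gram_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]


def elastic_penalty(weights, alpha):
    """Assumed mixed-norm penalty: alpha * ||w||_1 + (1 - alpha) * ||w||_2^2.

    alpha = 1 gives a pure L1 penalty; alpha = 0 gives a pure squared L2 penalty.
    This form is an illustration, not the regularizer defined in the paper.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return alpha * l1 + (1 - alpha) * l2_squared


# Three toy points and three pre-specified RBF bandwidths (hypothetical values).
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
gammas = [0.1, 1.0, 10.0]
grams = [gram_matrix(points, g) for g in gammas]

weights = [0.5, 0.5, 0.0]  # a sparse weight vector: the third kernel is switched off
K = combine_kernels(grams, weights)
penalty = elastic_penalty(weights, alpha=0.5)
```

In this sketch the weights are fixed by hand; a learning algorithm of the kind the abstract describes would instead optimize them jointly with the discriminant, with the mixing parameter controlling how many base kernels end up with zero weight.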
Cite this article
Liang, J., Chen, L. & Chen, X. Discriminant Kernel Learning Using Hybrid Regularization. Neural Process Lett 36, 257–273 (2012). https://doi.org/10.1007/s11063-012-9234-0
DOI: https://doi.org/10.1007/s11063-012-9234-0