An overview of kernel alignment and its applications

Artificial Intelligence Review

Abstract

The success of kernel methods depends heavily on the choice of kernel. Both kernel design and learning a kernel from data require evaluation measures to assess the quality of the kernel. In recent years, kernel alignment, which measures the degree of agreement between a kernel and a learning task, has been widely used for kernel selection because of its effectiveness and low computational complexity. In this paper, we present an overview of research progress on kernel alignment and its applications. We introduce the basic idea of kernel alignment and its theoretical properties, as well as extensions and improvements for specific learning problems. Typical applications, including kernel parameter tuning, multiple kernel learning, spectral kernel learning, and feature selection and extraction, are reviewed in the context of classification. The relationship between kernel alignment and other evaluation measures is also explored. Finally, concluding remarks and future directions are presented.
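To make the notion of "agreement between a kernel and a learning task" concrete, the following is a minimal sketch of the standard empirical kernel-target alignment for binary classification: the Frobenius inner product of a Gram matrix K with the ideal target matrix yy^T, normalized by their Frobenius norms. The abstract itself does not give the formula; the function names, the toy data, and the candidate bandwidths below are illustrative assumptions, not taken from the paper.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product <A, B>_F = sum over i, j of A[i][j] * B[i][j]."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(k1, k2):
    """Empirical alignment: <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )


def target_matrix(labels):
    """Ideal target matrix y y^T for labels in {-1, +1}."""
    return [[yi * yj for yj in labels] for yi in labels]


def rbf_gram(points, gamma):
    """Gram matrix of the Gaussian (RBF) kernel exp(-gamma * ||x - z||^2)."""
    def sq_dist(x, z):
        return sum((a - b) ** 2 for a, b in zip(x, z))
    return [[math.exp(-gamma * sq_dist(x, z)) for z in points] for x in points]


# Toy data, made up for illustration: two well-separated 1-D clusters.
points = [(0.0,), (0.1,), (1.0,), (1.1,)]
labels = [-1, -1, 1, 1]
target = target_matrix(labels)

# Kernel parameter tuning (hypothetical candidate grid):
# keep the gamma whose Gram matrix is best aligned with the target.
candidates = [0.01, 10.0, 10000.0]
scores = {g: alignment(rbf_gram(points, g), target) for g in candidates}
best_gamma = max(scores, key=scores.get)
```

On this toy data the middle bandwidth scores highest: a very small gamma makes all points look alike, while a very large one pushes the Gram matrix toward the identity. Computing the score needs only the Gram matrix and the labels, with no classifier training, which is the source of the low computational cost mentioned above.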

Corresponding author

Correspondence to Tinghua Wang.

About this article

Cite this article

Wang, T., Zhao, D. & Tian, S. An overview of kernel alignment and its applications. Artif Intell Rev 43, 179–192 (2015). https://doi.org/10.1007/s10462-012-9369-4

  • DOI: https://doi.org/10.1007/s10462-012-9369-4
