
Evolutionary strategies for hyperparameters of support vector machines based on multi-scale radial basis function kernels

  • Original Paper
  • Published in Soft Computing

Abstract

Kernel functions are used in support vector machines (SVMs) to compute inner products in a higher-dimensional feature space, and SVM classification performance depends on the chosen kernel. The radial basis function (RBF) kernel is a distance-based kernel that has been applied successfully in many tasks. This paper focuses on improving the accuracy of SVMs by proposing a non-linear combination of multiple RBF kernels at different scales, which yields more flexible kernel functions. The multi-scale RBF kernels are weighted and combined, and the resulting kernel allows better discrimination in the feature space; this new kernel is proved to be a Mercer kernel. Furthermore, evolutionary strategies (ESs) are used to adjust the hyperparameters of the SVM. Training accuracy, a bound on the generalization error, and subset cross-validation on training accuracy are considered as objective functions in the evolutionary process. The experimental results show that the accuracy of the multi-scale RBF kernel is better than that of a single RBF kernel. Moreover, subset cross-validation on training accuracy is the most suitable objective function and yields good results on benchmark datasets.
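As an illustrative sketch of the two main ideas in the abstract, the snippet below builds a multi-scale RBF kernel as a non-negative weighted sum of RBF terms (one standard way to combine Mercer kernels while preserving the Mercer property) and tunes a hyperparameter vector with a simple (1+1) evolution strategy. The names `multiscale_rbf_kernel` and `es_tune`, the 1/5-success step-size rule, and the toy fitness function are assumptions for illustration, not the paper's exact kernel combination, ES variant, or objective functions.

```python
import numpy as np

def multiscale_rbf_kernel(X, Y, weights, gammas):
    """Multi-scale RBF kernel (illustrative): a weighted sum of RBF
    kernels at several widths. Each RBF term is a Mercer kernel, so a
    non-negative weighted sum of them is again a Mercer kernel."""
    # Pairwise squared Euclidean distances between rows of X and rows of Y.
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    K = np.zeros((X.shape[0], Y.shape[0]))
    for a, g in zip(weights, gammas):
        K += a * np.exp(-g * sq)   # one scale, weighted by a >= 0
    return K

def es_tune(fitness, x0, sigma=0.3, iters=200, seed=0):
    """(1+1)-ES sketch with a 1/5-success-style step-size rule:
    mutate the hyperparameter vector with Gaussian noise, keep the
    child if it is no worse, and adapt the mutation strength."""
    rng = np.random.default_rng(seed)
    x, fx = np.asarray(x0, dtype=float).copy(), fitness(x0)
    for _ in range(iters):
        child = x + sigma * rng.standard_normal(x.shape)
        fc = fitness(child)
        if fc >= fx:               # maximization: accept non-worse child
            x, fx = child, fc
            sigma *= 1.22          # expand step size after a success
        else:
            sigma *= 0.82          # shrink step size after a failure
    return x, fx
```

In the paper's setting, `fitness` would evaluate an SVM trained with the candidate weights and widths (e.g., via subset cross-validation on training accuracy); here any scalar objective over the hyperparameter vector can be plugged in.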




Acknowledgments

The authors acknowledge the financial support provided by the Thailand Research Fund, the Royal Golden Jubilee Ph.D. Program, and the 90th Anniversary of Chulalongkorn University Fund (Ratchadaphiseksomphot Endowment Fund). The authors also would like to thank Ananlada Chotimongkol for proofreading the paper.

Author information

Corresponding author

Correspondence to Boonserm Kijsirikul.


About this article

Cite this article

Phienthrakul, T., Kijsirikul, B. Evolutionary strategies for hyperparameters of support vector machines based on multi-scale radial basis function kernels. Soft Comput 14, 681–699 (2010). https://doi.org/10.1007/s00500-009-0458-5

