
Pattern classification with mixtures of weighted least-squares support vector machine experts

  • Original Article
Neural Computing and Applications

Abstract

Support Vector Machine (SVM) classifiers are high-performance classification models designed to comply with the structural risk minimization principle and to exploit the kernel trick: input data are mapped nonlinearly into a high-dimensional feature space, where a linear decision boundary with better discriminating power can be constructed automatically. Among the several SVM variants, Least-Squares SVMs (LS-SVMs) have recently gained increased attention, mainly because of their computationally attractive properties, which result directly from a modified formulation that uses a sum-squared-error cost function together with equality, rather than inequality, constraints. In this work, we present a flexible hybrid approach aimed at improving LS-SVM classifiers with regard to accuracy/generalization as well as hyperparameter calibration. This approach, named Mixtures of Weighted Least-Squares Support Vector Machine Experts, centers on the fusion of the weighted variant of LS-SVMs with Mixtures of Experts models. After the formal characterization of the novel learning framework, simulation results for both binary and multiclass pattern classification problems are reported, confirming the suitability of the hybrid approach for improving the performance aspects considered.
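To make the computational appeal of the LS-SVM formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard LS-SVM classifier dual, in which the squared-error cost with equality constraints reduces training to a single linear system. The RBF kernel, the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the toy data, and the optional per-sample weights `v` (standing in for the weighted variant) are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0, v=None):
    """Train an LS-SVM classifier by solving one linear system.

    y holds labels in {-1, +1}. v is an optional vector of per-sample
    weights (assumed here to scale gamma per sample); v=None gives the
    unweighted LS-SVM.
    """
    n = X.shape[0]
    v = np.ones(n) if v is None else np.asarray(v, dtype=float)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    # Block system: [[0, y^T], [y, Omega + diag(1 / (gamma * v))]] [b; alpha] = [0; 1]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.diag(1.0 / (gamma * v))
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual variables alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Classify new points with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy data (hypothetical): two well-separated clusters in the plane.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
```

Because training is a linear solve rather than a quadratic program, every training point generally receives a nonzero `alpha`. The resulting loss of sparseness and the sensitivity to outliers are the usual motivations for weighted variants, which this sketch approximates by letting `v` rescale the regularization per sample.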


Acknowledgments

Fapesp sponsored the work of the first author via process #04/09597-0, CNPq/Funcap sponsored the work of the second author via process #23661-04, and CNPq sponsored the work of the third author via grant #303214/2007-0.

Author information

Corresponding author

Correspondence to Clodoaldo A. M. Lima.

About this article

Cite this article

Lima, C.A.M., Coelho, A.L.V. & Von Zuben, F.J. Pattern classification with mixtures of weighted least-squares support vector machine experts. Neural Comput & Applic 18, 843–860 (2009). https://doi.org/10.1007/s00521-008-0210-6

  • DOI: https://doi.org/10.1007/s00521-008-0210-6
