Border-sensitive learning in generalized learning vector quantization: an alternative to support vector machines

Abstract

Learning vector quantization (LVQ) algorithms are powerful classifier models for the class discrimination of vectorial data. They belong to the family of prototype-based classifiers, with a learning scheme based on Hebbian learning, a widely accepted neuronal learning paradigm. These classifiers estimate the class distributions and derive from them a class decision for the vectors to be classified. The estimation can be carried out by determining class-typical prototypes inside the class distribution areas, as in LVQ, or by detecting the class borders used for class discrimination, as preferred by support vector machines (SVMs). Both strategies offer advantages and disadvantages depending on the given classification task. Whereas LVQ is very intuitive and usually processes the data during learning in the original data space, frequently equipped with variants of the Euclidean metric, SVMs implicitly map the data into a high-dimensional kernel-induced feature space for better separation. In this Hilbert space, the inner product corresponds to the kernel. However, this implicit mapping makes a vivid interpretation of the model more difficult. As an alternative, we propose in this paper two modifications of LVQ that make it comparable to SVM: first, border-sensitive learning is introduced to obtain border-responsible prototypes comparable to the support vectors in SVM; second, kernel distances for differentiable kernels are considered, such that prototype learning takes place in a metric space isomorphic to the feature-mapping space of SVM. The combination of both features yields a powerful prototype-based classifier while keeping the easy interpretability and the intuitive Hebbian learning scheme of LVQ.
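
To make the proposal concrete, the following minimal sketch (Python/NumPy; not taken from the paper) illustrates a single stochastic GLVQ update combining the two ingredients described above: a differentiable Gaussian-kernel distance and a border-sensitivity window that lets only samples close to the current decision border move the prototypes. The function names kernel_distance and glvq_step and the parameters sigma (kernel bandwidth) and theta (window width) are illustrative assumptions, not the authors' notation.

```python
# Minimal sketch of kernelized, border-sensitive GLVQ (illustrative, not the
# authors' reference implementation). Assumes NumPy arrays and at least one
# prototype per class.
import numpy as np

def kernel_distance(x, w, sigma=1.0):
    """Kernel-induced squared distance d_k(x, w) = k(x, x) - 2 k(x, w) + k(w, w);
    for the Gaussian kernel this reduces to 2 - 2 k(x, w)."""
    k_xw = np.exp(-np.sum((x - w) ** 2) / (2.0 * sigma ** 2))
    return 2.0 - 2.0 * k_xw

def grad_kernel_distance(x, w, sigma=1.0):
    """Gradient of the kernel distance with respect to the prototype w."""
    diff = x - w
    k_xw = np.exp(-np.sum(diff ** 2) / (2.0 * sigma ** 2))
    return -2.0 * k_xw * diff / sigma ** 2

def glvq_step(x, y, prototypes, labels, lr=0.05, sigma=1.0, theta=0.3):
    """One stochastic GLVQ update with a border-sensitivity window.

    mu = (d_plus - d_minus) / (d_plus + d_minus) is the GLVQ classifier function;
    only samples with |mu| < theta (i.e. samples near the class border) trigger an
    update, so prototypes become border-responsible, similar to support vectors.
    """
    dists = np.array([kernel_distance(x, w, sigma) for w in prototypes])
    same = labels == y
    j_plus = np.where(same)[0][np.argmin(dists[same])]     # closest prototype of the same class
    j_minus = np.where(~same)[0][np.argmin(dists[~same])]  # closest prototype of a different class
    d_plus, d_minus = dists[j_plus], dists[j_minus]

    mu = (d_plus - d_minus) / (d_plus + d_minus)
    if abs(mu) >= theta:  # sample far from the border: no update (border sensitivity)
        return prototypes

    denom = (d_plus + d_minus) ** 2
    dmu_dplus = 2.0 * d_minus / denom    # > 0: pulls the correct prototype towards x
    dmu_dminus = -2.0 * d_plus / denom   # < 0: pushes the incorrect prototype away from x
    prototypes[j_plus] -= lr * dmu_dplus * grad_kernel_distance(x, prototypes[j_plus], sigma)
    prototypes[j_minus] -= lr * dmu_dminus * grad_kernel_distance(x, prototypes[j_minus], sigma)
    return prototypes
```

Iterating glvq_step over a labelled data set (prototypes as an (m, d) array, labels as a length-m array) drives the prototypes towards the class borders while they remain ordinary vectors in the data space, which preserves the direct interpretability emphasized above.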

Notes

  1. The theory of universal kernels is beyond the scope of this paper; we explicitly refer to Steinwart (2001) and Micchelli et al. (2006) for a precise definition and a discussion of their properties. Here, we only remark that exponential kernels belong to the set of universal kernels.

References

  • Aronszajn N (1950) Theory of reproducing kernels. Trans Am Math Soc 68:337–404

  • Barthel H, Villmann T, Hermann W, Hesse S, Kühn HJ, Wagner A, Kluge R (2001) Different patterns of brain glucose consumption in Wilson's disease. Zeitschrift für Gastroenterologie 39:241

  • Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2(1):1–127

  • Biehl M, Hammer B, Villmann T (2014) Distance measures for prototype based classification. In: Petkov N (ed) Proceedings of the international workshop on brain-inspired computing 2013 (Cetraro/Italy). Springer, Berlin

  • Blake C, Merz C (1998) UCI repository of machine learning databases. University of California, Irvine, CA, Department of Information and Computer Science. http://www.ics.edu/mlearn/MLRepository.html

  • Bunte K, Schneider P, Hammer B, Schleif FM, Villmann T, Biehl M (2012) Limited rank matrix learning, discriminative dimension reduction and visualization. Neural Netw 26(1):159–173

  • Caruana R, Niculescu-Mizil A (2006) An empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd international conference on machine learning. ACM, New York, pp 161–168

  • Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):27:1–27:27

  • Crammer K, Gilad-Bachrach R, Navot A, Tishby A (2003) Margin analysis of the LVQ algorithm. In: Becker S, Thrun K, Obermayer K (eds.) Advances in neural information processing (Proc. NIPS 2002), vol 15. MIT Press, Cambridge, pp 462–469

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

  • Duda R, Hart P (1973) Pattern classification and scene analysis. Wiley, New York

  • Fritzke B (1995) A growing neural gas network learns topologies. In: Tesauro G, Touretzky DS, Leen TK (eds) Advances in neural information processing systems, vol 7. MIT Press, Cambridge, pp 625–632

  • Günther P, Villmann T, Hermann W (2011) Event related potentials and cognitive evaluation in Wilson’s disease with and without neurological manifestation. J Neurol Sci [Turkish] 28(1):79–85

  • Gu Z, Shao M, Li L, Fu Y (2012) Discriminative metric: Schatten norms vs. vector norm. In: Proceedings of the 21st international conference on pattern recognition (ICPR 2012), pp 1213–1216

  • Hammer B, Nebel D, Riedel M, Villmann T (2014) Generative versus discriminative prototype based classification. In: Villmann T, Schleif FM, Kaden M, Lange M (eds) Advances in self-organizing maps and learning vector quantization: proceedings of 10th international workshop WSOM 2014, Mittweida. Advances in intelligent systems and computing, vol 295. Springer, Berlin, pp 123–132

  • Hammer B, Strickert M, Villmann T (2005) On the generalization ability of GRLVQ networks. Neural Process Lett 21(2):109–120

  • Hammer B, Strickert M, Villmann T (2005) Supervised neural gas with general similarity measure. Neural Process Lett 21(1):21–44

  • Hammer B, Villmann T (2002) Generalized relevance learning vector quantization. Neural Netw 15(8–9):1059–1068

  • Hasenjäger M, Ritter H (1998) Active learning with local models. Neural Process Lett 7:107–117

  • Hasenjäger M, Ritter H, Obermayer K (1999) Active learning in self-organizing maps. In: Oja E, Kaski S (eds) Kohonen maps. Elsevier, Amsterdam, pp 57–70

  • Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, Heidelberg

  • Haykin S (1994) Neural networks—a comprehensive foundation. IEEE Press, New York

  • Hermann W, Barthel H, Hesse S, Grahmann F, Kühn HJ, Wagner A, Villmann T (2002) Comparison of clinical types of Wilson’s disease and glucose metabolism in extrapyramidal motor brain regions. J Neurol 249(7):896–901

  • Hermann W, Günther P, Wagner A, Villmann T (2005) Klassifikation des Morbus Wilson auf der Basis neurophysiologischer Parameter. Der Nervenarzt 76:733–739

  • Hermann W, Villmann T, Grahmann F, Kühn H, Wagner A (2003) Investigation of fine motoric disturbances in Wilson’s disease. Neurol Sci 23(6):279–285

  • Hermann W, Villmann T, Wagner A (2003) Elektrophysiologisches Schädigungsprofil von Patienten mit einem Morbus Wilson. Der Nervenarzt 74(10):881–887

  • Hermann W, Wagner A, Kühn HJ, Grahmann F, Villmann T (2005) Classification of fine-motoric disturbances in Wilson’s disease using artificial neural networks. Acta Neurologica Scandinavia 111(6):400–406

  • Herrmann M, Bauer HU, Der R (1994) The ’perceptual magnet’ effect: a model based on self-organizing feature maps. In: Smith LS, Hancock PJB (eds) Neural computation and psychology. Springer, Stirling, pp 107–116

  • Horn R, Johnson C (2013) Matrix analysis, 2nd edn. Cambridge University Press, Cambridge

  • Kaden M, Hermann W, Villmann T (2014) Optimization of general statistical accuracy measures for classification based on learning vector quantization. In: Verleysen M (ed) Proceedings of European symposium on artificial neural networks, computational intelligence and machine learning (ESANN’2014). i6doc.com, Louvain-La-Neuve, Belgium, pp 47–52

  • Kaden M, Lange M, Nebel D, Riedel M, Geweniger T, Villmann T (2014) Aspects in classification learning—review of recent developments in learning vector quantization. Found Comput Decis Sci 39(2):79–105

  • Klingner M, Hellbach S, Riedel M, Kaden M, Villmann T, Böhme HJ (2014) RFSOM—extending self-organizing feature maps with adaptive metrics to combine spatial and textural features for body pose estimation. In: Villmann T, Schleif FM, Kaden M, Lange M (eds) Advances in self-organizing maps and learning vector quantization: proceedings of 10th international workshop WSOM 2014, Mittweida. Advances in intelligent systems and computing, vol 295. Springer, Berlin, pp 157–166

  • Kohonen T (1986) Learning vector quantization for pattern recognition. Report TKK-F-A601, Helsinki University of Technology, Espoo, Finland

  • Kohonen T (1990) Improved versions of learning vector quantization. In: Proceedings of IJCNN-90, international joint conference on neural networks, San Diego, vol I. IEEE Service Center, Piscataway, pp 545–550

  • Kohonen T (1995) Self-organizing maps. Springer Series in Information Sciences, vol 30. Springer, Berlin. (Second Extended Edition 1997)

  • Kohonen T, Kangas J, Laaksonen J, Torkkola K (1992) LVQ_PAK: a program package for the correct application of Learning Vector Quantization algorithms. In: Proceedings of IJCNN'92, international joint conference on neural networks, vol I. IEEE Service Center, Piscataway, pp 725–730

  • Martinetz T, Schulten K (1994) Topology representing networks. Neural Netw 7(2)

  • Martinetz TM, Berkovich SG, Schulten KJ (1993) ’Neural-gas’ network for vector quantization and its application to time-series prediction. IEEE Trans Neural Netw 4(4):558–569

  • Mercer J (1909) Functions of positive and negative type and their connection with the theory of integral equations. Philos Trans R Soc Lond A 209:415–446

  • Micchelli C, Xu Y, Zhang H (2006) Universal kernels. J Mach Learn Res 7:2651–2667

  • Nova D, Estévez P (2013) A review of learning vector quantization classifiers. Neural Comput Appl. doi:10.1007/s00521-013-1535-3

  • Qin A, Suganthan P (2004) A novel kernel prototype-based learning algorithm. In: Proceedings of the 17th international conference on pattern recognition (ICPR’04), vol 4, pp 621–624

  • Sachs L (1992) Angewandte Statistik, 7th edn. Springer, Berlin

  • Sato A, Tsukumo J (1994) A criterion for training reference vectors and improved vector quantization. In: Proceedings of ICNN’94, international conference on neural networks. IEEE Service Center, Piscataway, pp 161–166

  • Sato A, Yamada K (1996) Generalized learning vector quantization. In: Touretzky DS, Mozer MC, Hasselmo ME (eds) Advances in neural information processing systems, vol 8. Proceedings of the 1995 conference. MIT Press, Cambridge, pp 423–429

  • Sato A, Yamada K (1995) A proposal of generalized learning vector quantization. Tech Rep IEICE 95(346):161–166

  • Schatten R (1950) A theory of cross-spaces. Annals of Mathematics Studies, vol 26. Princeton University Press, Princeton

  • Schleif FM, Hammer B, Villmann T (2007) Margin-based active learning for LVQ networks. Neurocomputing 70(7–9):1215–1224

  • Schleif FM, Villmann T, Hammer B, Schneider P (2011) Efficient kernelized prototype based classification. Int J Neural Syst 21(6):443–457

  • Schleif FM, Villmann T, Kostrzewa M, Hammer B, Gammerman A (2009) Cancer informatics by prototype networks in mass spectrometry. Artif Intell Med 45(2–3):215–228

  • Schölkopf B, Smola A (2002) Learning with kernels. MIT Press, Cambridge

  • Schneider P, Bunte K, Stiekema H, Hammer B, Villmann T, Biehl M (2010) Regularization in matrix relevance learning. IEEE Trans Neural Netw 21(5):831–840

  • Schneider P, Hammer B, Biehl M (2009a) Adaptive relevance matrices in learning vector quantization. Neural Comput 21:3532–3561

  • Schneider P, Hammer B, Biehl M (2009b) Distance learning in discriminative vector quantization. Neural Comput 21:2942–2969

  • Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis and discovery. Cambridge University Press, Cambridge

  • Steinwart I (2001) On the influence of the kernel on the consistency of support vector machines. J Mach Learn Res 2:67–93

  • Strickert M (2011) Enhancing M|G|RLVQ by quasi step discriminatory functions using 2nd order training. Machine Learning Reports 5 (MLR-06-2011), pp 5–15. ISSN: 1865–3960. http://www.techfak.uni-bielefeld.de/~fschleif/mlr/mlr_06_2011.pdf

  • Villmann T (2002) Neural maps for faithful data modelling in medicine—state of the art and exemplary applications. Neurocomputing 48(1–4):229–250

  • Villmann T, Geweniger T, Kästner M (2012) Border sensitive fuzzy classification learning in fuzzy vector quantization. Mach Learn Rep 6(MLR-06-2012):23–39. http://www.techfak.uni-bielefeld.de/~fschleif/mlr/mlr_06_2012.pdf. ISSN:1865–3960

  • Villmann T, Haase S (2011) Divergence based vector quantization. Neural Comput 23(5):1343–1392

  • Villmann T, Haase S, Kaden M (2014) Kernelized vector quantization in gradient-descent learning. Neurocomputing (in press)

  • Villmann T, Haase S, Kästner M (2013) Gradient based learning in vector quantization using differentiable kernels. In: Estevez P, Principe J, Zegers P (eds) Advances in self-organizing maps: 9th international workshop WSOM 2012 Santiago de Chile. Advances in intelligent systems and computing, vol 198. Springer, Berlin, pp 193–204

  • Villmann T, Merényi E, Hammer B (2003) Neural maps in remote sensing image analysis. Neural Netw 16(3–4):389–403

  • Witoelar A, Ghosh A, de Vries J, Hammer B, Biehl M (2010) Window-based example selection in learning vector quantization. Neural Comput 22(11):2924–2961

  • Wutzler U, Venner, Villmann T, Decker O, Ott U, Steiner T, Gumz A (2009) Recording of dissimulation and denial in the context of the psychosomatic evaluation at living kidney transplantation using the Minnesota Multiphasic Personality Inventory (MMPI). GMS Psycho Soc Med 6:1–11

  • Yin C, Mu S, Tian S (2012) Using cooperative clustering to solve multiclass problems. In: Wang Y, Li T (eds) Foundation of intelligent systems—proceedings of the sixth international conference on intelligent systems and knowledge engineering (ISKE 2011), Shanghai, China. Advances in intelligent and soft computing, vol 122. Springer, Berlin, pp 327–334

Acknowledgments

M. Kaden and M. Riedel acknowledge funding by the European Social Fund (ESF), Saxony, Germany.

Corresponding author

Correspondence to Thomas Villmann.

Additional information

Communicated by I. R. Ruiz.

About this article

Cite this article

Kaden, M., Riedel, M., Hermann, W. et al. Border-sensitive learning in generalized learning vector quantization: an alternative to support vector machines. Soft Comput 19, 2423–2434 (2015). https://doi.org/10.1007/s00500-014-1496-1
