Abstract
Learning vector quantization (LVQ) algorithms are powerful models for classifying vectorial data. They belong to the family of prototype-based classifiers, and their learning scheme is based on Hebbian learning, a widely accepted neuronal learning paradigm. Such classifiers estimate the class distribution and derive from it a class decision for the vectors to be classified. The estimation can be done by determining class-typical prototypes inside the class distribution areas, as in LVQ, or by detecting the class borders for class discrimination, as preferred by support vector machines (SVMs). Both strategies have advantages and disadvantages depending on the given classification task. LVQs are very intuitive and usually process the data during learning in the data space, frequently equipped with variants of the Euclidean metric, whereas SVMs implicitly map the data into a high-dimensional kernel-induced feature space for better separation. In this Hilbert space, the inner product is consistent with the kernel. However, this implicit mapping makes an intuitive interpretation more difficult. As an alternative, we propose in this paper two modifications of LVQ to make it comparable to SVM. First, border-sensitive learning is introduced to obtain border-responsible prototypes comparable to the support vectors in SVM. Second, kernel distances for differentiable kernels are considered, such that prototype learning takes place in a metric space isomorphic to the feature mapping space of SVM. The combination of both features yields a powerful prototype-based classifier that keeps the easy interpretation and the intuitive Hebbian learning scheme of LVQ.
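The kernel-distance idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian (RBF) kernel as one example of a differentiable kernel, uses the standard kernel-induced distance d(x, w)^2 = k(x, x) - 2 k(x, w) + k(w, w), and evaluates the usual GLVQ classifier function mu = (d_plus - d_minus) / (d_plus + d_minus). The function names, the parameter `sigma`, and the toy prototypes are hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(x, w, sigma=1.0):
    """Gaussian kernel; an assumed example of a differentiable kernel."""
    diff = np.asarray(x, dtype=float) - np.asarray(w, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def kernel_distance(x, w, kernel=rbf_kernel):
    """Distance induced by the kernel: sqrt(k(x,x) - 2 k(x,w) + k(w,w))."""
    sq = kernel(x, x) - 2.0 * kernel(x, w) + kernel(w, w)
    # Clip tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(sq, 0.0)))


def classify(x, prototypes, labels, kernel=rbf_kernel):
    """Nearest-prototype decision using the kernel distance."""
    dists = [kernel_distance(x, w, kernel) for w in prototypes]
    return labels[int(np.argmin(dists))]


def glvq_mu(x, y, prototypes, labels, kernel=rbf_kernel):
    """GLVQ classifier function for a sample x with true label y.

    d_plus is the distance to the closest prototype with label y,
    d_minus the distance to the closest prototype with another label.
    The value lies in [-1, 1]; it is negative when x is classified
    correctly and close to zero when x lies near the decision border.
    """
    dists = np.array([kernel_distance(x, w, kernel) for w in prototypes])
    labels = np.asarray(labels)
    d_plus = dists[labels == y].min()
    d_minus = dists[labels != y].min()
    return float((d_plus - d_minus) / (d_plus + d_minus))


# Toy data (hypothetical): one prototype per class.
prototypes = np.array([[0.0, 0.0], [3.0, 3.0]])
labels = [0, 1]
sample = np.array([0.2, 0.1])

predicted = classify(sample, prototypes, labels)
mu = glvq_mu(sample, 0, prototypes, labels)
```

In this sketch the prototypes stay in the data space and only the distance changes, which is what keeps them interpretable. Samples with `mu` near zero are the ones closest to the class border, so a border-sensitive scheme would weight them most strongly; how that weighting is realized in the paper is not shown here.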
Acknowledgments
M. Kaden and M. Riedel acknowledge funding by the European Social Fund (ESF), Saxony, Germany.
Additional information
Communicated by I. R. Ruiz.
Cite this article
Kaden, M., Riedel, M., Hermann, W. et al. Border-sensitive learning in generalized learning vector quantization: an alternative to support vector machines. Soft Comput 19, 2423–2434 (2015). https://doi.org/10.1007/s00500-014-1496-1
DOI: https://doi.org/10.1007/s00500-014-1496-1