Abstract
This paper investigates the mathematically appropriate treatment of data density estimators in machine learning approaches when these estimators rely on data dissimilarity density models. For two well-known machine learning approaches, one for classification and one for data visualization, we show that this dependence becomes apparent when analyzing the respective mathematical models. Numerical experiments demonstrate that a given data set generates different data dissimilarity densities depending on the dissimilarity measure in use. An appropriate choice of the density model in machine learning approaches is therefore mandatory to process the data consistently.
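The abstract's central claim, that the same data set induces differently shaped dissimilarity densities under different dissimilarity measures, can be illustrated with a small numerical experiment. The following sketch (not the paper's own experiment; data, sample size, and the skewness summary are illustrative assumptions) compares the empirical densities of pairwise Euclidean distances and cosine dissimilarities on one random sample:

```python
# Illustrative sketch: the empirical distribution of pairwise dissimilarities
# depends on the dissimilarity measure chosen. We compare Euclidean distance
# with cosine dissimilarity on the same toy data set.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # toy data set: 200 points in 10 dimensions

# Pairwise Euclidean distances (upper triangle, i < j, to avoid duplicates)
diff = X[:, None, :] - X[None, :, :]
iu = np.triu_indices(len(X), k=1)
eucl = np.sqrt((diff ** 2).sum(-1))[iu]

# Pairwise cosine dissimilarities 1 - cos(angle between points)
norms = np.linalg.norm(X, axis=1)
cos_sim = (X @ X.T) / np.outer(norms, norms)
cosd = (1.0 - cos_sim)[iu]

def skewness(d):
    """Sample skewness as a crude summary of the density's shape."""
    z = (d - d.mean()) / d.std()
    return (z ** 3).mean()

# The two empirical dissimilarity densities differ in location and shape,
# so a density model fitted to one need not suit the other.
print(f"Euclidean: mean={eucl.mean():.2f}, skew={skewness(eucl):.2f}")
print(f"Cosine:    mean={cosd.mean():.2f}, skew={skewness(cosd):.2f}")
```

Plotting histograms of `eucl` and `cosd` makes the mismatch visible directly; the differing summary statistics are the point, not their particular values.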
Notes
- 1.
- 2. The Tecator data set is available at http://lib.stat.cmu.edu/datasets/tecator.
- 3. The PIMA data set is available at http://www.ics.edu/mlearn/MLRepository.html.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Villmann, T., Kaden, M., Mohannazadeh Bakhtiari, M., Villmann, A. (2019). Appropriate Data Density Models in Probabilistic Machine Learning Approaches for Data Analysis. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2019. Lecture Notes in Computer Science(), vol 11509. Springer, Cham. https://doi.org/10.1007/978-3-030-20915-5_40
DOI: https://doi.org/10.1007/978-3-030-20915-5_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20914-8
Online ISBN: 978-3-030-20915-5
eBook Packages: Computer Science (R0)