Weighted Hamming Metric and KNN Classification of Nominal-Continuous Data

Denisiuk, Aleksander

doi:10.1007/978-3-031-36021-3_31

Aleksander Denisiuk (ORCID: orcid.org/0000-0002-7501-7048)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14074)

Included in the following conference series:

International Conference on Computational Science

Abstract

The purpose of this article is to develop a new metric learning algorithm for a combination of continuous and nominal data. We start with the Euclidean metric for the continuous part and the Hamming metric for the nominal part of the data. The impact of each feature is modeled by a corresponding weight in the metric definition. A new algorithm for automatic detection of the weights is proposed. The weighted metric is then used in the standard kNN classification algorithm. A series of numerical experiments shows that the algorithm can successfully classify raw, non-normalized data.
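
To make the idea in the abstract concrete, the following is a minimal sketch of a weighted Euclidean-plus-Hamming distance plugged into a plain kNN majority vote. It is an illustration only, not the chapter's algorithm: the exact form of the metric (a square root of weighted squared differences plus weighted mismatch indicators), the function names `weighted_mixed_distance` and `knn_predict`, the toy data, and the hand-picked weights are all assumptions. In particular, the automatic weight detection proposed in the chapter is not reproduced here.

```python
from collections import Counter
from math import sqrt


def weighted_mixed_distance(x_cont, x_nom, y_cont, y_nom, w_cont, w_nom):
    """Assumed metric: sqrt(sum_i w_i (x_i - y_i)^2 + sum_j v_j [a_j != b_j])."""
    # Weighted squared Euclidean part over the continuous features.
    cont = sum(w * (a - b) ** 2 for w, a, b in zip(w_cont, x_cont, y_cont))
    # Weighted Hamming part: a nominal feature contributes its weight on mismatch.
    nom = sum(v for v, a, b in zip(w_nom, x_nom, y_nom) if a != b)
    return sqrt(cont + nom)


def knn_predict(train, query, w_cont, w_nom, k=3):
    """train: list of (continuous, nominal, label); query: (continuous, nominal)."""
    q_cont, q_nom = query
    ranked = sorted(
        train,
        key=lambda r: weighted_mixed_distance(r[0], r[1], q_cont, q_nom, w_cont, w_nom),
    )
    # Majority vote among the k nearest training points.
    votes = Counter(label for _, _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the second continuous feature has a large raw scale
# and carries no class information.
train = [
    ((1.0, 200.0), ("red",), "A"),
    ((1.2, 900.0), ("red",), "A"),
    ((5.0, 210.0), ("blue",), "B"),
    ((5.1, 880.0), ("blue",), "B"),
]
query = ((1.1, 890.0), ("red",))

# Hand-picked weights that switch off the uninformative feature (an assumption;
# the chapter learns weights automatically).
print(knn_predict(train, query, w_cont=(1.0, 0.0), w_nom=(1.0,), k=3))
# Uniform weights let the large-scale feature dominate the distance.
print(knn_predict(train, query, w_cont=(1.0, 1.0), w_nom=(1.0,), k=3))
```

On this toy set the two calls give different predictions, which shows why per-feature weights matter when the data are not normalized: with uniform weights the feature with the largest raw scale decides the neighbours.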

Author information

Authors and Affiliations

University of Warmia and Mazury in Olsztyn, ul. Słoneczna 54, 10-710, Olsztyn, Poland
Aleksander Denisiuk

Corresponding author

Correspondence to Aleksander Denisiuk.

Editor information

Editors and Affiliations

Czech Technical University in Prague, Prague, Czech Republic
Jiří Mikyška
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee at Knoxville, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M.A. Sloot

About this paper

Cite this paper

Denisiuk, A. (2023). Weighted Hamming Metric and KNN Classification of Nominal-Continuous Data. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14074. Springer, Cham. https://doi.org/10.1007/978-3-031-36021-3_31

DOI: https://doi.org/10.1007/978-3-031-36021-3_31
Published: 26 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36020-6
Online ISBN: 978-3-031-36021-3
eBook Packages: Computer Science, Computer Science (R0)
