Weighted Hamming Metric and KNN Classification of Nominal-Continuous Data

  • Conference paper
Computational Science – ICCS 2023 (ICCS 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14074))

Abstract

This article develops a new metric learning algorithm for data that combine continuous and nominal features. We start from the Euclidean metric for the continuous part of the data and the Hamming metric for the nominal part. The impact of each feature is modeled by a corresponding weight in the metric definition. A new algorithm for automatic detection of the weights is proposed. The weighted metric is then used in the standard k-nearest-neighbors (kNN) classification algorithm. A series of numerical experiments shows that the algorithm can successfully classify raw, non-normalized data.
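To make the idea of a weighted mixed metric concrete, the following is a minimal Python sketch, not the paper's method. It assumes one plausible way of combining the two parts: weighted squared differences on continuous features plus weighted 0/1 mismatches on nominal features, under a single square root. The function names, the toy data, and the hand-picked weights are all hypothetical; the abstract does not specify the exact metric formula or how the weights are learned, so no weight-learning step is shown.

```python
import math
from collections import Counter


def mixed_distance(a, b, cont_idx, nom_idx, weights):
    """Assumed form of a weighted nominal-continuous distance.

    Continuous features contribute w_j * (a_j - b_j)^2 (weighted Euclidean part);
    nominal features contribute w_j when the values differ (weighted Hamming part).
    """
    total = 0.0
    for j in cont_idx:
        total += weights[j] * (a[j] - b[j]) ** 2
    for j in nom_idx:
        total += weights[j] * (0.0 if a[j] == b[j] else 1.0)
    return math.sqrt(total)


def knn_predict(train_X, train_y, query, k, cont_idx, nom_idx, weights):
    """Standard kNN majority vote using the distance above."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: mixed_distance(pair[0], query, cont_idx, nom_idx, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): feature 0 is continuous and unnormalized, feature 1 is nominal.
train_X = [(150.0, "red"), (152.0, "red"), (180.0, "blue"), (183.0, "blue")]
train_y = ["a", "a", "b", "b"]
query = (151.0, "blue")

# Equal weights: the raw continuous scale dominates the distance.
pred_equal = knn_predict(train_X, train_y, query, 3, [0], [1], [1.0, 1.0])

# Down-weighting the continuous feature lets the nominal feature decide.
pred_scaled = knn_predict(train_X, train_y, query, 3, [0], [1], [0.0001, 1.0])
```

With these toy values the prediction changes between the two weight settings, which illustrates why the choice of weights matters when the data are not normalized.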

Author information

Corresponding author

Correspondence to Aleksander Denisiuk.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Denisiuk, A. (2023). Weighted Hamming Metric and KNN Classification of Nominal-Continuous Data. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14074. Springer, Cham. https://doi.org/10.1007/978-3-031-36021-3_31

  • DOI: https://doi.org/10.1007/978-3-031-36021-3_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36020-6

  • Online ISBN: 978-3-031-36021-3

  • eBook Packages: Computer Science, Computer Science (R0)
