Optimising the Distance Metric in the Nearest Neighbour Algorithm on a Real-World Patient Classification Problem

  • Conference paper
Methodologies for Knowledge Discovery and Data Mining (PAKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1574)

Abstract

This study develops a new method for finding the optimal non-Euclidean distance metric in the nearest neighbour algorithm. The method was developed on a real-world doctor-shopper classification problem. Mutual information, a statistical measure derived from Shannon’s information theory, is used to weight the attributes in the distance metric. On a five-class classification task, this weighted distance metric produced a much better agreement rate than the Euclidean distance metric (63% versus 51%). The agreement rate increased further, to 77% when a genetic algorithm was used to optimise the weights and to 73% when simulated annealing was used. This performance paves the way for a highly accurate system that detects high-risk doctor-shoppers automatically and efficiently.
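The attribute-weighting idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the metric, so the sketch assumes discrete attribute values, a weighted Euclidean distance in which each weight multiplies the squared attribute difference, and a single nearest neighbour. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def mi_weights(rows, labels):
    """One weight per attribute: its mutual information with the class label."""
    return [mutual_information(col, labels) for col in zip(*rows)]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (assumed form); unit weights give the plain metric."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def nearest_neighbour(query, rows, labels, weights):
    """Label of the training row closest to the query under the weighted metric."""
    best = min(range(len(rows)),
               key=lambda i: weighted_distance(query, rows[i], weights))
    return labels[best]

# Toy data (made up): attribute 0 determines the class, attribute 1 is noise.
rows = [(0, 0), (1, 5), (2, 0), (3, 5)]
labels = ["low", "low", "high", "high"]
weights = mi_weights(rows, labels)

query = (1.4, 0)
euclidean_label = nearest_neighbour(query, rows, labels, [1.0, 1.0])
weighted_label = nearest_neighbour(query, rows, labels, weights)
print(weights, euclidean_label, weighted_label)
```

On this toy data the noise attribute receives a weight of zero, so the weighted metric assigns the query to the class of its closest neighbour along the informative attribute ("low"), whereas the unweighted Euclidean metric is pulled towards a "high" row by the noise attribute. A genetic algorithm or simulated annealing, as mentioned in the abstract, would start from such weights and search for a vector that gives a higher agreement rate.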

References

  • Dasarathy, B. V.: 1991, NN (Nearest Neighbour) Norms: NN pattern Classification Techniques, IEEE CS Press, Los Alamitos, Calif.

  • Haykin, S.: 1994, Neural Networks: A Comprehensive Foundation, Macmillan, New York.

  • Holland, J. H.: 1992, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, MA.

  • Kirkpatrick, S.: 1983, Optimisation by simulated annealing, Science 220, 671–680.

  • Kirkpatrick, S.: 1984, Optimisation by simulated annealing: Quantitative studies, Journal of Statistical Physics 34, 975–986.

  • Linsker, R.: 1990, Connectionist modelling and brain function: The developing interface, MIT Press, Cambridge, MA, pp. 351–392.

  • Shannon, C. E. and Weaver, W.: 1949, The Mathematical Theory of Communication, University of Illinois Press, Urbana, IL.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, H., Hawkins, S. (1999). Optimising the Distance Metric in the Nearest Neighbour Algorithm on a Real-World Patient Classification Problem. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_49

  • DOI: https://doi.org/10.1007/3-540-48912-6_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65866-5

  • Online ISBN: 978-3-540-48912-2

  • eBook Packages: Springer Book Archive