
Data Characterization for Effective Prototype Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3523)

Abstract

The Nearest Neighbor classifier is one of the most popular supervised classification methods: it is simple, intuitive, and accurate in a wide variety of real-world applications. Despite its simplicity and effectiveness, its practical use has historically been limited by high storage requirements, the computational cost of classification, and its sensitivity to outliers. These drawbacks can be alleviated by a suitable prototype selection scheme, which reduces storage and computing time and often yields some increase in classification accuracy. Nevertheless, in some practical cases prototype selection may even degrade the effectiveness of the classifier, and from an empirical point of view it is still difficult to know a priori when it will behave appropriately. This paper attempts to predict how well a prototype selection algorithm will perform on a particular problem by characterizing the data with a set of complexity measures.
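
As a rough illustration of the ideas summarized above (not the method proposed in the paper, which is behind the subscription wall), the following Python sketch applies one classical prototype selection scheme, Hart's condensed nearest neighbour (CNN) rule, to a toy two-class problem and reports one simple data-complexity measure, Fisher's discriminant ratio (F1), of the kind studied by Ho and Basu. The dataset, the choice of measure, and all parameter values are illustrative assumptions.

    # Illustrative sketch only: CNN prototype selection plus the F1 complexity
    # measure on a two-class subset of Iris. Not the authors' exact procedure.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier


    def fisher_ratio_f1(X, y):
        """Maximum Fisher's discriminant ratio over features (two-class case)."""
        c0, c1 = np.unique(y)
        a, b = X[y == c0], X[y == c1]
        num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
        den = a.var(axis=0) + b.var(axis=0) + 1e-12
        return float(np.max(num / den))


    def condensed_nn(X, y, seed=0):
        """Hart's CNN: keep a subset of prototypes that classifies the
        whole training set consistently with the 1-NN rule."""
        order = np.random.default_rng(seed).permutation(len(X))
        store = [order[0]]
        changed = True
        while changed:
            changed = False
            for i in order:
                knn = KNeighborsClassifier(n_neighbors=1).fit(X[store], y[store])
                if knn.predict(X[i:i + 1])[0] != y[i]:
                    store.append(i)          # misclassified: add to the store
                    changed = True
        return np.array(store)


    if __name__ == "__main__":
        data = load_iris()
        mask = data.target < 2               # two classes, so F1 is well defined
        X, y = data.data[mask], data.target[mask]
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=0)

        print("F1 complexity of the training data:", fisher_ratio_f1(X_tr, y_tr))

        full = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
        kept = condensed_nn(X_tr, y_tr, seed=0)
        reduced = KNeighborsClassifier(n_neighbors=1).fit(X_tr[kept], y_tr[kept])

        print("1-NN accuracy, all prototypes :", full.score(X_te, y_te))
        print("1-NN accuracy, CNN prototypes :", reduced.score(X_te, y_te),
              f"({len(kept)}/{len(X_tr)} kept)")

In the spirit of the abstract, a measure such as F1 (or the other geometric measures of Ho and Basu) could be computed before deciding whether to apply a scheme like CNN, since on easy, well-separated data the reduced prototype set tends to preserve accuracy, whereas on highly overlapped data it may not.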





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mollineda, R.A., Sánchez, J.S., Sotoca, J.M. (2005). Data Characterization for Effective Prototype Selection. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_4


  • DOI: https://doi.org/10.1007/11492542_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26154-4

  • Online ISBN: 978-3-540-32238-2

  • eBook Packages: Computer Science (R0)
