
Fast Supervised Selection of Prototypes for Metric-Based Learning

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2018 (ICANN 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11140)


Abstract

A crucial factor for successful learning is finding a more convenient representation of a problem, so that subsequent processing can be handed over to linear or non-linear modeling methods. Similarity functions are a flexible way to express knowledge about a problem and to capture meaningful relations among data in input space. In this paper we use similarity functions to find an alternative data representation, which is then reduced by selecting a subset of relevant prototypes in a supervised way. The idea is tested on a set of modeling problems characterized by a mixture of data types and different amounts of missing values. The results demonstrate performance competitive with, or better than, that of traditional methods in terms of prediction error and sparsity of the representation.
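
As a concrete illustration of this pipeline, the following is a minimal Python sketch, assuming a simple RBF similarity over numeric features and a greedy, accuracy-driven prototype selector. The paper's actual similarity function (designed for heterogeneous data with missing values) and its selection criterion are not reproduced in this excerpt, so all names below are illustrative:

    import numpy as np

    def rbf_similarity(X, P, gamma=0.5):
        """Similarity of each row of X to each prototype row of P (RBF kernel)."""
        d2 = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * d2)

    def greedy_prototype_selection(X, y, n_prototypes=4):
        """Greedy forward selection: repeatedly add the training point whose
        similarity column most improves nearest-prototype training accuracy."""
        selected, candidates = [], list(range(len(X)))
        for _ in range(n_prototypes):
            best, best_acc = None, -1.0
            for c in candidates:
                S = rbf_similarity(X, X[selected + [c]])    # reduced representation
                pred = y[selected + [c]][S.argmax(axis=1)]  # label of most similar prototype
                acc = (pred == y).mean()
                if acc > best_acc:
                    best, best_acc = c, acc
            selected.append(best)
            candidates.remove(best)
        return selected

    # Toy usage: two Gaussian blobs; four prototypes kept out of 100 points.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    print("selected prototypes:", greedy_prototype_selection(X, y))

The columns of the similarity matrix act as the new features; selection keeps only the columns (prototypes) useful to the supervised task, which is what yields the sparsity mentioned in the abstract.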


Notes

  1. For example, by the presence of missing values, by the feature semantics, etc.

  2. Such variables are increasingly common, especially when they refer to a time periodicity, such as the month of the year (see the first sketch after these notes).

  3. It is not difficult to check that this is equivalent to replacing the missing similarities by the average of the non-missing ones (see the second sketch after these notes). The conjecture, therefore, is that the missing values, if known, would not change the overall similarity significantly.

  4. This property is not used in this work, but it is interesting in other contexts, such as optimization.

  5. The experiments were run on an HP laptop with 2 GB of RAM and an Intel(R) Core(TM)2 Duo T7500 CPU at 2.20 GHz.

  6. See the caption of Table 1 for a description.
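
Note 2 concerns cyclic variables. A minimal sketch of one common way to score similarity on such a variable (illustrative only; the paper's exact treatment is not reproduced here):

    import numpy as np

    def cyclic_similarity(a, b, period=12):
        """Similarity in [0, 1] for a cyclic variable such as month-of-year:
        shortest distance around the circle, rescaled so that identical
        values score 1 and diametrically opposed values score 0."""
        d = np.abs(a - b) % period
        d = np.minimum(d, period - d)      # wrap-around distance
        return 1.0 - 2.0 * d / period

    print(cyclic_similarity(12, 1))   # December vs. January -> 0.833...
    print(cyclic_similarity(1, 7))    # January vs. July     -> 0.0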
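
Note 3 describes how missing values are handled when per-feature similarities are averaged. A small sketch of a Gower-style coefficient that makes the stated equivalence concrete (function and variable names are illustrative; numeric features are assumed pre-scaled to [0, 1]):

    import numpy as np

    def gower_similarity(x, z):
        """Average the per-feature similarities over the features observed in
        both records. This is equivalent to replacing each missing per-feature
        similarity by the average of the non-missing ones, as note 3 states."""
        s = 1.0 - np.abs(x - z)   # per-feature similarity; nan if either side is missing
        return s[~np.isnan(s)].mean()

    x = np.array([0.2, np.nan, 0.9])
    z = np.array([0.3, 0.5, 0.8])
    print(gower_similarity(x, z))   # mean of the two observed similarities: 0.9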


Author information


Correspondence to Lluís A. Belanche.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Belanche, L.A. (2018). Fast Supervised Selection of Prototypes for Metric-Based Learning. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2018. Lecture Notes in Computer Science, vol. 11140. Springer, Cham. https://doi.org/10.1007/978-3-030-01421-6_55


  • DOI: https://doi.org/10.1007/978-3-030-01421-6_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01420-9

  • Online ISBN: 978-3-030-01421-6

  • eBook Packages: Computer Science, Computer Science (R0)
