Abstract
A crucial factor for successful learning is finding more convenient representations of a problem, so that subsequent processing can be handed over to linear or non-linear modelling methods. Similarity functions are a flexible way to express knowledge about a problem and to capture meaningful relations among the data in input space. In this paper we use similarity functions to derive an alternative data representation, which is then reduced by selecting a subset of relevant prototypes in a supervised way. The idea is tested on a set of modelling problems characterized by a mixture of data types and different amounts of missing values. The results show performance competitive with, or better than, that of traditional methods in terms of both prediction error and sparsity of the representation.
Notes
1. For example, by the presence of missing values, by the feature semantics, etc.
2. Such variables are increasingly common, especially when they refer to a time periodicity, such as the month in a year.
3. It is not difficult to check that this is equivalent to replacing the missing similarities by the average of the non-missing ones. The conjecture is therefore that the missing values, if known, would not change the overall similarity significantly.
4. This property is not used in this work, but it is interesting in other contexts, such as optimization.
5. The experiments were run on an HP laptop with 2 GB of RAM and an Intel(R) Core(TM)2 Duo CPU T7500 at 2.20 GHz.
6. See the caption of Table 1 for a description.
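The equivalence claimed in Note 3 can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a Gower-style similarity that averages per-feature similarities over the feature pairs observed in both objects (the feature values and the agreement rule used here are hypothetical), and checks that skipping missing pairs gives the same result as imputing them with the mean of the observed similarities.

```python
import math

def gower_similarity(x, y):
    """Average per-feature similarity over the features that are
    observed (non-missing) in both objects. Skipping missing pairs is
    equivalent to replacing their similarities with the mean of the
    non-missing ones (Note 3)."""
    sims = []
    for a, b in zip(x, y):
        if a is None or b is None:
            continue  # missing in either object: drop this feature pair
        sims.append(1.0 if a == b else 0.0)  # simple nominal agreement
    if not sims:
        return float('nan')  # no feature observed in common
    return sum(sims) / len(sims)

# Two objects with one missing value in the third feature.
x = ['red', 1, None]
y = ['red', 0, 'yes']
s = gower_similarity(x, y)  # observed similarities: 1.0 and 0.0

# Imputing the missing slot with the observed average s and dividing
# by the full feature count leaves the overall similarity unchanged:
full = (1.0 + 0.0 + s) / 3
assert abs(full - s) < 1e-12
```

The assertion holds for any number of missing pairs, since adding copies of the mean to a sum and enlarging the denominator accordingly does not move the average.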
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Belanche, L.A. (2018). Fast Supervised Selection of Prototypes for Metric-Based Learning. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science(), vol 11140. Springer, Cham. https://doi.org/10.1007/978-3-030-01421-6_55
Print ISBN: 978-3-030-01420-9
Online ISBN: 978-3-030-01421-6
eBook Packages: Computer Science (R0)