Adaptive Distance Metrics for Nearest Neighbour Classification Based on Genetic Programming

Agapitos, Alexandros; O’Neill, Michael; Brabazon, Anthony

doi:10.1007/978-3-642-37207-0_1

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7831)

Abstract

Nearest Neighbour (NN) classification is a widely used, effective method for both binary and multi-class problems. It relies on the assumption that class conditional probabilities are locally constant. However, this assumption becomes invalid in high dimensions, and severe bias can be introduced, which degrades the performance of the method. A locally adaptive distance metric therefore becomes crucial for keeping class conditional probabilities approximately uniform, whereby better classification performance can be attained. This paper presents a locally adaptive distance metric for NN classification based on a supervised learning algorithm (Genetic Programming) that learns a vector of feature weights for the features composing an instance query. Used in a weighted Euclidean distance metric, these weights adapt the shape of the neighbourhood to the query location, stretching the neighbourhood along the directions for which the class conditional probabilities do not change much. Initial empirical results on a set of real-world classification datasets showed that the proposed method enhances the generalisation performance of the standard NN algorithm, and that it is a competent method for pattern classification as compared to other learning algorithms.
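
To illustrate how a vector of feature weights reshapes the neighbourhood, the following is a minimal sketch of NN classification under a weighted Euclidean distance. It is not the authors' implementation: the function names (`weighted_euclidean`, `knn_predict`), the toy data, and the weights are assumptions chosen for illustration, and the weights are fixed by hand here, whereas the paper learns them with Genetic Programming.

```python
import math
from collections import Counter


def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def knn_predict(query, train_X, train_y, weights, k=1):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_euclidean(query, pair[0], weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the class depends only on feature 0;
# feature 1 is irrelevant to the class label.
train_X = [(0.0, 0.0), (1.0, 3.0)]
train_y = ["a", "b"]
query = (0.9, 0.2)  # close to the "b" example along feature 0

# Uniform weights give the standard Euclidean metric.
uniform = knn_predict(query, train_X, train_y, weights=(1.0, 1.0))
# A zero weight on feature 1 stretches the neighbourhood along that axis.
adapted = knn_predict(query, train_X, train_y, weights=(1.0, 0.0))
```

With uniform weights the irrelevant feature dominates the distance and the query is assigned to "a"; with the weight on feature 1 set to zero, the neighbourhood extends along that direction and the query is assigned to "b".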

Author information

Authors and Affiliations

Financial Mathematics and Computation Research Cluster, Complex and Adaptive Systems Laboratory, University College Dublin, Ireland
Alexandros Agapitos, Michael O’Neill & Anthony Brabazon

Editor information

Editors and Affiliations

Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965, Poznań, Poland
Krzysztof Krawiec
School of Computer Science, The University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK
Alberto Moraglio
Geisel School of Medicine, Dartmouth College, 03755, Hanover, NH, USA
Ting Hu
Department of Computer Engineering, Istanbul Technical University, 34469, Maslak, Istanbul, Turkey
A. Şima Etaner-Uyar
Institute of Computer Graphics and Algorithms, Vienna University of Technology, 1040, Vienna, Austria
Bin Hu

About this paper

Cite this paper

Agapitos, A., O’Neill, M., Brabazon, A. (2013). Adaptive Distance Metrics for Nearest Neighbour Classification Based on Genetic Programming. In: Krawiec, K., Moraglio, A., Hu, T., Etaner-Uyar, A.Ş., Hu, B. (eds) Genetic Programming. EuroGP 2013. Lecture Notes in Computer Science, vol 7831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37207-0_1

DOI: https://doi.org/10.1007/978-3-642-37207-0_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37206-3
Online ISBN: 978-3-642-37207-0
eBook Packages: Computer Science, Computer Science (R0)
