Improving the Behavior of the Nearest Neighbor Classifier against Noisy Data with Feature Weighting Schemes

Sáez, José A.; Derrac, Joaquín; Luengo, Julián; Herrera, Francisco

doi:10.1007/978-3-319-07617-1_52

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8480)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

The Nearest Neighbor rule is one of the most successful classifiers in machine learning, but it is very sensitive to noisy data, which may cause its performance to deteriorate. This contribution proposes a new feature weighting classifier that aims to reduce the influence of noisy features. The weights are computed by combining imputation methods with nonparametric statistical tests. The results obtained show that the proposal can improve the performance of the Nearest Neighbor classifier when dealing with different types of noisy data.
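
As a minimal sketch of the general idea of feature weighting in a nearest neighbor classifier, the following Python snippet scales each feature's contribution to the distance by a weight. It does not reproduce the weight computation proposed in the paper (imputation combined with nonparametric tests), which the abstract does not detail. The function names, the toy data, and the hand-picked weights are all hypothetical and serve only as illustration.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def predict_1nn(train_X, train_y, query, weights):
    """Return the label of the training instance closest to `query`."""
    best = min(
        range(len(train_X)),
        key=lambda i: weighted_distance(train_X[i], query, weights),
    )
    return train_y[best]

# Toy data (assumed): feature 0 carries the class signal, feature 1 is noise.
train_X = [[0.0, 1.0], [0.1, 0.0], [1.0, 0.5], [0.9, 8.0]]
train_y = ["a", "a", "b", "b"]
query = [0.05, 8.0]  # feature 0 suggests class "a"; feature 1 is misleading

uniform = [1.0, 1.0]        # plain 1-NN: every feature counts equally
down_weighted = [1.0, 0.0]  # hypothetical weights that ignore the noisy feature

label_uniform = predict_1nn(train_X, train_y, query, uniform)
label_weighted = predict_1nn(train_X, train_y, query, down_weighted)
print(label_uniform, label_weighted)
```

With uniform weights the noisy second feature dominates the distance and pulls the query toward the wrong neighbor. Reducing that feature's weight lets the informative feature decide. In practice the weights would be estimated from the data rather than set by hand.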

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, University of Granada, CITIC-UGR, Granada, Spain, 18071
José A. Sáez & Francisco Herrera
School of Computer Science & Informatics, Cardiff University, Cardiff, CF24 3AA, United Kingdom
Joaquín Derrac
Department of Civil Engineering, LSI, University of Burgos, Burgos, Spain, 09006
Julián Luengo

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, University of Cyprus, 75 Kallipoleos Avenue, 1678, Nicosia, Cyprus
Marios Polycarpou
Department of Computer Science, University of Sao Paulo at Sao Carlos, Sao Carlos, SP, Brazil
André C. P. L. F. de Carvalho
Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, No. 415, Chien Kung Road, 80778, Kaohsiung, Taiwan
Jeng-Shyang Pan
Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
Michał Woźniak
University of Salamanca, Plaza de la Merced S/N, 37008 Salamanca, Spain and University of A Coruna, Escuela Universitaria Politecnica, Departamento de Enxeñeria Industrial, A Coruna, Spain
Héctor Quintian
University of Salamanca, Plaza de la Merced S/N, 37008, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Sáez, J.A., Derrac, J., Luengo, J., Herrera, F. (2014). Improving the Behavior of the Nearest Neighbor Classifier against Noisy Data with Feature Weighting Schemes. In: Polycarpou, M., de Carvalho, A.C.P.L.F., Pan, JS., Woźniak, M., Quintian, H., Corchado, E. (eds) Hybrid Artificial Intelligence Systems. HAIS 2014. Lecture Notes in Computer Science, vol 8480. Springer, Cham. https://doi.org/10.1007/978-3-319-07617-1_52

DOI: https://doi.org/10.1007/978-3-319-07617-1_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07616-4
Online ISBN: 978-3-319-07617-1
eBook Packages: Computer Science; Computer Science (R0)
