Abstract
The k-NN classifier is one of the best-known and most widely used nonparametric classifiers. The k-NN rule is asymptotically optimal: its classification error converges to the Bayes error as the number of training samples approaches infinity. Many extensions of the traditional k-NN have been developed to improve classification accuracy. However, it is also well known that as the number of samples grows, k-NN can become very inefficient, because the distances from the test sample to every sample in the training set must be computed. In this paper, a simple method that addresses this issue is proposed. Combining the k-NN classifier with the centroid neighbor classifier improves the speed of the algorithm without changing the results of the original k-NN. Moreover, using confusion matrices and excluding outliers makes the resulting algorithm considerably faster and more robust.
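The combination described in the abstract can be sketched as follows. This is a hypothetical illustration only: the paper's exact pruning rule is not given here, so the function names, the parameter m (number of candidate classes kept after the centroid step), and the majority-vote tie handling are all assumptions, not the authors' method.

```python
import numpy as np

def centroid_knn_predict(X_train, y_train, x, k=5, m=2):
    """Sketch: use class centroids to pre-select the m closest classes,
    then run a standard k-NN vote only on samples of those classes,
    avoiding distance computations to the rest of the training set."""
    classes = np.unique(y_train)
    # One centroid per class: the mean of that class's training samples
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    # Distance from the test sample to each class centroid
    d_cent = np.linalg.norm(centroids - x, axis=1)
    keep = classes[np.argsort(d_cent)[:m]]
    # Restrict the training set to the m nearest classes
    mask = np.isin(y_train, keep)
    Xs, ys = X_train[mask], y_train[mask]
    # Standard k-NN majority vote on the reduced set
    d = np.linalg.norm(Xs - x, axis=1)
    nearest = ys[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```

When m equals the total number of classes, this reduces to plain k-NN; smaller m trades exactness for speed, which is the trade-off the paper's confusion-matrix and outlier-exclusion steps are meant to control.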
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Chmielnicki, W. (2016). Combining k-Nearest Neighbor and Centroid Neighbor Classifier for Fast and Robust Classification. In: Martínez-Álvarez, F., Troncoso, A., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2016. Lecture Notes in Computer Science(), vol 9648. Springer, Cham. https://doi.org/10.1007/978-3-319-32034-2_45
Print ISBN: 978-3-319-32033-5
Online ISBN: 978-3-319-32034-2