Abstract
A common problem in data classification is incomplete data, and it is not always possible to re-acquire the missing values. An alternative is to fill in the missing values using statistical methods. This, however, distorts the original data and may over-fit the classifier to the artificially generated values and, in consequence, lead to an overestimate of the classifier's accuracy in cross-validation tests. In this paper we propose a solution in which, for a reference data set consisting of complete and incomplete records, the complete records serve as reference data for a standard classifier, while the whole set serves as reference data for a single-feature subspace classifier.
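The split of reference data described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two classifiers' outputs are combined, so the summed-vote rule, the choice to skip the standard classifier for incomplete queries, the Euclidean distance, and the names `knn_votes` and `combined_knn_predict` are all assumptions made for illustration. Missing values are assumed to be encoded as NaN and class labels as integers starting at 0.

```python
import numpy as np


def knn_votes(ref_X, ref_y, query, k, n_classes):
    """Count class votes among the k nearest reference rows (Euclidean distance)."""
    if len(ref_X) == 0:
        return np.zeros(n_classes)
    dist = np.linalg.norm(ref_X - query, axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    return np.bincount(ref_y[nearest], minlength=n_classes).astype(float)


def combined_knn_predict(X, y, query, k=3):
    """Hypothetical combination rule: sum the votes of both classifiers."""
    n_classes = int(y.max()) + 1
    votes = np.zeros(n_classes)

    # Standard kNN: only complete records are used as reference data.
    # Assumption: it is skipped when the query itself has missing values.
    complete = ~np.isnan(X).any(axis=1)
    if not np.isnan(query).any():
        votes += knn_votes(X[complete], y[complete], query, k, n_classes)

    # Single-feature subspace kNN: each feature uses every record
    # (complete or not) in which that feature is present.
    for j in range(X.shape[1]):
        if np.isnan(query[j]):
            continue  # this feature cannot vote for this query
        present = ~np.isnan(X[:, j])
        votes += knn_votes(X[present, j:j + 1], y[present],
                           query[j:j + 1], k, n_classes)

    return int(np.argmax(votes))


# Toy reference set: two records are incomplete (NaN marks a missing value).
X = np.array([[1.0, 1.0], [1.2, np.nan], [0.9, 1.1],
              [5.0, 5.0], [np.nan, 5.2], [5.1, 4.9]])
y = np.array([0, 0, 0, 1, 1, 1])

print(combined_knn_predict(X, y, np.array([1.1, 1.0])))     # complete query
print(combined_knn_predict(X, y, np.array([np.nan, 5.0])))  # incomplete query
```

In this sketch no value is imputed: the incomplete records contribute only through the features they actually contain, which is the property the abstract emphasises.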
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Orczyk, T., Doroz, R., Porwik, P. (2020). Combined kNN Classifier for Classification of Incomplete Data. In: Burduk, R., Kurzynski, M., Wozniak, M. (eds) Progress in Computer Recognition Systems. CORES 2019. Advances in Intelligent Systems and Computing, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-19738-4_3
DOI: https://doi.org/10.1007/978-3-030-19738-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19737-7
Online ISBN: 978-3-030-19738-4
eBook Packages: Intelligent Technologies and Robotics (R0)