Abstract
The k-nn algorithm is often used for classification, but the distance measures it relies on are usually designed for real-valued, fully known data. In real applications the input values are imperfect: imprecise, uncertain, or even missing. In most applications, missing values are handled by marginalization or imputation. Unfortunately, both methods have drawbacks. The choice of a specific imputation method has a large impact on the classifier's answer, while marginalization may cause a large part of the available data to be ignored. Therefore, this paper proposes a new algorithm. It is designed to work with interval-type input data and, when values are missing from a sample, it analyses the whole domain of possible values for the corresponding attributes. The proposed system generalizes the k-nn algorithm and gives a rough-specific answer, which states whether the test sample may or must belong to a certain set of classes. An important feature of the proposed system is that it reduces the set of possible classes and specifies the set of certain classes in the course of filling in the missing values with sets of possible values.
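The "may or must belong" answer can be illustrated with a small sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a brute-force approximation under stated assumptions: each missing or interval-valued attribute is replaced by a few evenly spaced grid points from its range, plain k-nn is run for every resulting completion, and the classes that win for every completion are reported as certain while those that win for at least one are reported as possible. The names `knn_winners`, `rough_knn`, `domains`, and `steps` are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def knn_winners(train, labels, x, k):
    """Plain k-nn: return the set of classes tied for the most votes
    among the k training points nearest to x (Euclidean distance)."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = Counter(labels[i] for i in order[:k])
    top = max(votes.values())
    return {c for c, v in votes.items() if v == top}


def rough_knn(train, labels, sample, domains, k=3, steps=5):
    """Approximate (certain, possible) class sets for an imperfect sample.

    sample  -- one entry per attribute: a number (known value),
               a (lo, hi) tuple (interval), or None (missing).
    domains -- one (lo, hi) tuple per attribute, used when the value is missing.
    steps   -- grid points per non-degenerate range (an assumption of this
               sketch; a finer grid gives a closer approximation).
    """
    axes = []
    for value, (lo, hi) in zip(sample, domains):
        if value is None:
            a, b = lo, hi          # missing: consider the whole domain
        elif isinstance(value, tuple):
            a, b = value           # interval: consider the given range
        else:
            a = b = value          # known value: a single point
        if a == b:
            axes.append([a])
        else:
            axes.append([a + (b - a) * t / (steps - 1) for t in range(steps)])

    possible, certain = set(), None
    for x in product(*axes):       # every grid completion of the sample
        winners = knn_winners(train, labels, x, k)
        possible |= winners
        certain = winners if certain is None else certain & winners
    return certain, possible


# Toy data: two well-separated classes in two dimensions.
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
domains = [(0, 6), (0, 6)]

print(rough_knn(train, labels, [0.5, 0.5], domains))   # fully known sample
print(rough_knn(train, labels, [0.5, None], domains))  # second attribute missing
```

With the second attribute missing, the sample can land near either cluster, so no class is certain and both remain possible. Narrowing the missing value to an interval such as `(0.0, 1.0)` shrinks the possible set, which mirrors the reduction described in the abstract.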
Acknowledgments
The project was funded by the National Science Centre under decision number DEC-2012/05/B/ST6/03620.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Nowicki, R.K., Nowak, B.A., Woźniak, M. (2016). Application of Rough Sets in k Nearest Neighbours Algorithm for Classification of Incomplete Samples. In: Kunifuji, S., Papadopoulos, G., Skulimowski, A., Kacprzyk, J. (eds) Knowledge, Information and Creativity Support Systems. Advances in Intelligent Systems and Computing, vol 416. Springer, Cham. https://doi.org/10.1007/978-3-319-27478-2_17
DOI: https://doi.org/10.1007/978-3-319-27478-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27477-5
Online ISBN: 978-3-319-27478-2
eBook Packages: Engineering, Engineering (R0)