Application of Rough Sets in k Nearest Neighbours Algorithm for Classification of Incomplete Samples

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 416)

Abstract

The k-nn algorithm is often used for classification, but the distance measures used in this algorithm are usually designed to work with real and known data. In real applications the input values are imperfect: imprecise, uncertain and even missing. In most applications, the last issue is handled by marginalization or imputation. These methods unfortunately have many drawbacks. The choice of a specific imputation method has a big impact on the classifier's answer. Marginalization, on the other hand, can cause even a large part of the available data to be ignored. Therefore, a new algorithm is proposed in the paper. It is designed to work with interval-type input data and, when values are missing from a sample, it analyses the whole domain of possible values for the corresponding attributes. The proposed system generalizes the k-nn algorithm and gives a rough-specific answer, which states whether the test sample may or must belong to a certain set of classes. An important feature of the proposed system is that it reduces the set of possible classes and specifies the set of certain classes by filling the missing values with the set of possible values.
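The "may or must belong" answer described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give its details, so the code below makes simplifying assumptions. A missing attribute is marked with `None`, its domain is scanned on an evenly spaced grid (a hypothetical `steps` parameter), plain k-nn with Euclidean distance is run for every candidate completion, and a class is treated as certain only if it is the single class predicted for all sampled completions. The names `knn_predict` and `rough_knn` are illustrative.

```python
from collections import Counter
from itertools import product
import math


def knn_predict(train, x, k):
    """Plain k-nn: majority class among the k training points closest to x."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    # Ties are broken by the smallest label to keep the result deterministic.
    best = max(votes.values())
    return min(label for label, n in votes.items() if n == best)


def rough_knn(train, sample, domains, k=3, steps=11):
    """Return (possible, certain) class sets for a sample with missing values.

    sample  : list of floats, with None marking a missing attribute
    domains : list of (low, high) bounds, one per attribute
    Assumption: the domain of a missing attribute is sampled on a grid,
    so the result approximates a scan of the whole domain.
    """
    axes = []
    for value, (low, high) in zip(sample, domains):
        if value is None:
            # Missing attribute: scan its whole domain on an even grid.
            axes.append([low + (high - low) * i / (steps - 1) for i in range(steps)])
        else:
            axes.append([value])
    # Every class predicted for some completion is "possible".
    possible = {knn_predict(train, point, k) for point in product(*axes)}
    # A class is "certain" only if every completion yields that same class.
    certain = possible if len(possible) == 1 else set()
    return possible, certain


train = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"),
]
domains = [(0.0, 6.0), (0.0, 6.0)]

print(rough_knn(train, [0.5, 0.5], domains))   # complete sample
print(rough_knn(train, [None, 0.5], domains))  # first attribute missing
```

With the first attribute missing, completions near the origin vote for class A and completions near the far end of the domain vote for class B, so both classes are possible and none is certain. Narrowing the domain of the missing attribute shrinks the possible set, which mirrors the reduction of possible classes mentioned in the abstract.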

Acknowledgments

The project was funded by the National Science Centre under decision number DEC-2012/05/B/ST6/03620.

Corresponding author

Correspondence to Robert K. Nowicki.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Nowicki, R.K., Nowak, B.A., Woźniak, M. (2016). Application of Rough Sets in k Nearest Neighbours Algorithm for Classification of Incomplete Samples. In: Kunifuji, S., Papadopoulos, G., Skulimowski, A., Kacprzyk, J. (eds) Knowledge, Information and Creativity Support Systems. Advances in Intelligent Systems and Computing, vol 416. Springer, Cham. https://doi.org/10.1007/978-3-319-27478-2_17

  • DOI: https://doi.org/10.1007/978-3-319-27478-2_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27477-5

  • Online ISBN: 978-3-319-27478-2

  • eBook Packages: Engineering, Engineering (R0)
