FIT2COMIn – Robust Clustering Algorithm for Incomplete Data

  • Conference paper
Man-Machine Interactions 6 (ICMMI 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1061)

Abstract

In this paper we propose a new fuzzy interval type-2 C-ordered-means clustering algorithm for incomplete data. The algorithm handles missing values with both marginalisation and imputation. Thanks to imputation, the values present in incomplete items are not lost; thanks to marginalisation, imputed data can be distinguished from original complete items. The algorithm uses rough fuzzy sets (interval type-2 fuzzy sets) to model the imprecision and incompleteness of data. To handle outliers, the algorithm uses loss functions, an ordering technique, and typicalities; outliers are assigned low typicality values. The paper also describes a new imputation technique: imputation with values from k nearest neighbours.
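
The abstract does not spell out the imputation procedure, so the following is only a minimal sketch of generic k-nearest-neighbour imputation, not the paper's implementation. The function name `knn_impute`, the partial distance over shared observed attributes, and the use of the neighbours' mean are all assumptions made for illustration. The returned mask records which values were imputed, which loosely mirrors the idea that imputed data should stay distinguishable from original data.

```python
import math


def knn_impute(data, k=2):
    """Fill missing values (None) with the mean of the k nearest donors.

    Hypothetical helper, not taken from the paper.
    Returns (imputed, mask); mask[i][j] is True where a value was missing.
    """

    def distance(a, b):
        # Partial distance (an assumption): use only attributes observed
        # in both items and rescale by the fraction of attributes used.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    imputed = [list(row) for row in data]
    mask = [[value is None for value in row] for row in data]

    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other items in which attribute j is observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(data)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if math.isfinite(d[0])]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                # Mean of the neighbours' values (an assumption).
                imputed[i][j] = sum(nearest) / len(nearest)
            # With no usable donor the value stays None.

    return imputed, mask
```

For example, calling `knn_impute([[1.0, 2.0], [1.1, None], [0.9, 2.2], [5.0, 9.0]], k=2)` fills the missing attribute of the second item from its two closest neighbours (the first and third items) and leaves the distant fourth item out. The input list is not modified.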

Acknowledgements

The research has been supported by the Rector’s Grant for Research and Development (Silesian University of Technology, grant number: 02/020/RGJ19/0165).

Author information

Correspondence to Krzysztof Siminski.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Siminski, K. (2020). FIT2COMIn – Robust Clustering Algorithm for Incomplete Data. In: Gruca, A., Czachórski, T., Deorowicz, S., Harężlak, K., Piotrowska, A. (eds) Man-Machine Interactions 6. ICMMI 2019. Advances in Intelligent Systems and Computing, vol 1061. Springer, Cham. https://doi.org/10.1007/978-3-030-31964-9_10
