Handling Noise and Outliers in Fuzzy Clustering

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 326)

Abstract

Since it is an unsupervised data analysis approach, clustering relies solely on the location of the data points in the data space or, alternatively, on their relative distances or similarities. As a consequence, clustering can suffer from the presence of noisy data points and outliers, which can obscure the structure of the clusters in the data and thus may drive clustering algorithms to yield suboptimal or even misleading results. Fuzzy clustering is no exception in this respect, although it features an aspect of robustness, due to which outliers and generally data points that are atypical for the clusters in the data have a lesser influence on the cluster parameters. Starting from this aspect, we provide in this paper an overview of different approaches with which fuzzy clustering can be made less sensitive to noise and outliers and categorize them according to the component of standard fuzzy clustering they modify.

Notes

  1. Note that this approach is not restricted to fuzzy clustering, but can be applied to any clustering scheme, including classical \(c\)-means clustering.

  2. Note that computing the membership degrees remains unchanged, regardless of the distance measure and whether it is squared or not, because for this computation the cluster prototypes are fixed and thus the distances are effectively constants.
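
The point of note 2 can be illustrated with a minimal sketch. It assumes the standard fuzzy \(c\)-means membership update, in which each distance value is weighted by the membership degree raised to the fuzzifier \(m\) in the objective function. The function name `fcm_memberships`, the array layout, and the toy data are assumptions made for illustration and are not taken from the chapter.

```python
import numpy as np

def fcm_memberships(dist, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update for fixed prototypes.

    dist: array of shape (c, n); dist[i, j] is the distance value of data
          point j to cluster i exactly as it enters the objective function
          (squared or not, Euclidean or not).
    m:    fuzzifier, m > 1.
    """
    d = np.maximum(np.asarray(dist, dtype=float), eps)  # guard against zero distances
    w = d ** (-1.0 / (m - 1.0))                         # unnormalized membership weights
    return w / w.sum(axis=0, keepdims=True)             # memberships of each point sum to 1

# Toy data (assumed): three points, two fixed prototypes.
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
C = np.array([[0.0, 0.0], [2.0, 0.0]])
diff = X[None, :, :] - C[:, None, :]                    # shape (c, n, dims)

d_sq = (diff ** 2).sum(axis=2)                          # squared Euclidean distances
d_l1 = np.abs(diff).sum(axis=2)                         # L1 (Manhattan) distances

# The same update rule is applied in both cases; only the constants differ.
u_sq = fcm_memberships(d_sq)
u_l1 = fcm_memberships(d_l1)
```

Because the prototypes are held fixed, the distance matrix is just a table of constants, so the same function serves both distance measures. Only the prototype update step would have to change when the distance measure changes.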

Author information

Corresponding author

Correspondence to Christian Borgelt.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Borgelt, C., Braune, C., Lesot, MJ., Kruse, R. (2015). Handling Noise and Outliers in Fuzzy Clustering. In: Tamir, D., Rishe, N., Kandel, A. (eds) Fifty Years of Fuzzy Logic and its Applications. Studies in Fuzziness and Soft Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-19683-1_17

  • DOI: https://doi.org/10.1007/978-3-319-19683-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19682-4

  • Online ISBN: 978-3-319-19683-1

  • eBook Packages: Engineering, Engineering (R0)
