Abstract
Because it is an unsupervised data analysis approach, clustering relies solely on the location of the data points in the data space or, alternatively, on their relative distances or similarities. As a consequence, clustering can suffer from noisy data points and outliers, which can obscure the cluster structure in the data and may thus drive clustering algorithms to yield suboptimal or even misleading results. Fuzzy clustering is no exception in this respect, although it features an aspect of robustness: outliers, and more generally data points that are atypical for the clusters in the data, have a lesser influence on the cluster parameters. Starting from this aspect, we provide in this paper an overview of different approaches with which fuzzy clustering can be made less sensitive to noise and outliers, and we categorize them according to the component of standard fuzzy clustering they modify.
Notes
1. Note that this approach is not restricted to fuzzy clustering but can be applied to any clustering scheme, including classical \(c\)-means clustering.
2. Note that the computation of the membership degrees remains unchanged, regardless of the distance measure and of whether it is squared or not, because for this computation the cluster prototypes are fixed and thus the distances are effectively constants.
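The second note can be illustrated with a minimal sketch of the standard fuzzy \(c\)-means membership update for a single data point. The sketch assumes that the distance terms to the fixed prototypes have already been computed; the function name `membership_degrees` and the parameter names `costs` and `fuzzifier` are illustrative choices, not taken from the chapter. Because the routine only ever sees numbers, the same code applies whether those numbers are squared Euclidean distances, plain distances, or values of some other distance measure.

```python
def membership_degrees(costs, fuzzifier=2.0):
    """Standard fuzzy c-means membership update for ONE data point.

    costs: one non-negative value per cluster. Each value is whatever
           distance term enters the objective function (squared or not),
           evaluated with the cluster prototypes held fixed.
    fuzzifier: weighting exponent, must be greater than 1 (assumed name).
    """
    if fuzzifier <= 1.0:
        raise ValueError("fuzzifier must be greater than 1")

    # A point that coincides with one or more prototypes is shared
    # equally among exactly those prototypes.
    zeros = [i for i, c in enumerate(costs) if c == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(costs))]

    # u_i is proportional to cost_i ** (1 / (1 - fuzzifier)),
    # normalized so that the degrees sum to 1.
    exponent = 1.0 / (1.0 - fuzzifier)
    weights = [c ** exponent for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


distances = [1.0, 3.0]                      # distances to two fixed prototypes
squared = [d * d for d in distances]

u_squared = membership_degrees(squared)     # objective with squared distances
u_plain = membership_degrees(distances)     # objective with unsquared distances
u_outlier = membership_degrees([100.0 ** 2, 101.0 ** 2])  # far from both
```

The update rule is identical in both calls; only the constants passed in differ, so the resulting degrees differ as well. The last call shows the limited robustness mentioned in the abstract: a point far from all prototypes receives nearly equal membership degrees in every cluster.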
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Borgelt, C., Braune, C., Lesot, MJ., Kruse, R. (2015). Handling Noise and Outliers in Fuzzy Clustering. In: Tamir, D., Rishe, N., Kandel, A. (eds) Fifty Years of Fuzzy Logic and its Applications. Studies in Fuzziness and Soft Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-19683-1_17
DOI: https://doi.org/10.1007/978-3-319-19683-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19682-4
Online ISBN: 978-3-319-19683-1
eBook Packages: Engineering, Engineering (R0)