Abstract
Clustering aims to partition a data set into homogeneous groups of similar objects. Object similarity or, more often, object dissimilarity is usually expressed in terms of a distance function. This approach, however, is not viable when dissimilarity is conceptual rather than metric. In this paper, we propose to extract the dissimilarity relation directly from the available data. To this end, we train a feedforward neural network on a set of point pairs whose dissimilarity is known. We then use the dissimilarity measure generated by the network to guide a new unsupervised fuzzy relational clustering algorithm. Experiments on an artificial data set and a real data set show that the clustering algorithm based on the neural dissimilarity outperforms some widely used (possibly partially supervised) clustering algorithms based on spatial dissimilarity.
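The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture or the relational clustering algorithm, so the sketch assumes a one-hidden-layer network trained by plain gradient descent and swaps in a simple fuzzy c-medoids update for the relational step. The symmetric pair encoding, the toy data, and all function names and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def pair_features(a, b):
    # Assumed symmetric encoding of a pair, so that d(a, b) equals d(b, a).
    return np.concatenate([np.abs(a - b), a + b], axis=-1)

def train_mlp(X, y, hidden=8, epochs=2000, lr=0.5, seed=0):
    # One-hidden-layer network (tanh hidden units, sigmoid output) fitted to
    # dissimilarity targets in [0, 1] by batch gradient descent on squared error.
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        g_out = (out - y[:, None]) * out * (1.0 - out) / len(X)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ g_out); b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_h); b1 -= lr * g_h.sum(axis=0)
    def predict(Z):
        return (1.0 / (1.0 + np.exp(-(np.tanh(Z @ W1 + b1) @ W2 + b2))))[:, 0]
    return predict

def dissimilarity_matrix(net, X):
    # Evaluate the learned dissimilarity on every pair of points.
    i, j = np.indices((len(X), len(X)))
    D = net(pair_features(X[i.ravel()], X[j.ravel()])).reshape(len(X), len(X))
    np.fill_diagonal(D, 0.0)
    return D

def fuzzy_c_medoids(D, c=2, m=2.0, iters=50, seed=0):
    # Stand-in relational clustering: uses only the relation matrix D.
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=c, replace=False)
    for _ in range(iters):
        w = (D[:, medoids] + 1e-9) ** (-1.0 / (m - 1.0))
        U = w / w.sum(axis=1, keepdims=True)          # fuzzy memberships
        new = np.argmin((U ** m).T @ D, axis=1)       # best medoid per cluster
        if np.array_equal(new, medoids):
            break
        medoids = new
    return U, medoids

# Toy data (assumption): two blobs; pairs are dissimilar (1) if labels differ.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(2.0, 0.3, (20, 2))])
labels = np.repeat([0, 1], 20)
i, j = rng.integers(0, len(X), size=(2, 200))         # 200 supervised pairs
net = train_mlp(pair_features(X[i], X[j]), (labels[i] != labels[j]).astype(float))
D = dissimilarity_matrix(net, X)                      # learned relation
U, medoids = fuzzy_c_medoids(D, c=2)                  # unsupervised step
```

Only the pairs used for training carry supervision; the clustering step sees nothing but the relation matrix D, which is what allows the dissimilarity to be conceptual rather than a fixed spatial distance.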
Notes
Actually, our method can deal with both similarity and dissimilarity relations.
Cite this article
Corsini, P., Lazzerini, B. & Marcelloni, F. Combining supervised and unsupervised learning for data clustering. Neural Comput & Applic 15, 289–297 (2006). https://doi.org/10.1007/s00521-006-0030-5
DOI: https://doi.org/10.1007/s00521-006-0030-5