
Combining supervised and unsupervised learning for data clustering

  • Original Article
Neural Computing & Applications

Abstract

Clustering aims to partition a data set into homogeneous groups of similar objects. Object similarity, or more often object dissimilarity, is usually expressed in terms of some distance function. This approach, however, is not viable when dissimilarity is conceptual rather than metric. In this paper, we propose to extract the dissimilarity relation directly from the available data. To this end, we train a feedforward neural network on a set of point pairs with known dissimilarity. We then use the dissimilarity measure generated by the network to guide a new unsupervised fuzzy relational clustering algorithm. An artificial data set and a real data set are used to show how the clustering algorithm based on the neural dissimilarity outperforms some widely used (possibly partially supervised) clustering algorithms based on spatial dissimilarity.
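
The second stage described above, clustering from a dissimilarity relation rather than from coordinates, can be illustrated with a minimal sketch. It rests on several assumptions: `learned_dissimilarity` is a hypothetical stand-in for the trained network (here a hand-written "conceptual" dissimilarity, the difference in distance from the origin), the toy data are two concentric rings, and the clustering step is a plain fuzzy c-medoids-style relational update, not the algorithm proposed in the paper.

```python
import math

def learned_dissimilarity(a, b):
    # Hypothetical placeholder for the network output: two points are
    # "conceptually" close when they lie at a similar radius.
    return abs(math.hypot(*a) - math.hypot(*b))

def dissimilarity_matrix(points, dissim):
    # Relational description of the data: only pairwise values are kept.
    n = len(points)
    return [[0.0 if i == j else dissim(points[i], points[j])
             for j in range(n)] for i in range(n)]

def memberships(R, medoids, m=2.0, eps=1e-12):
    # Fuzzy membership of each object in each cluster; rows sum to 1.
    U = []
    for j in range(len(R)):
        inv = [(1.0 / max(R[j][v], eps)) ** (1.0 / (m - 1.0)) for v in medoids]
        total = sum(inv)
        U.append([w / total for w in inv])
    return U

def fuzzy_c_medoids(R, c, m=2.0, max_iter=50):
    n = len(R)
    # Deterministic farthest-first initialisation (an assumption of this sketch).
    medoids = [0]
    while len(medoids) < c:
        medoids.append(max(range(n), key=lambda q: min(R[q][v] for v in medoids)))
    for _ in range(max_iter):
        U = memberships(R, medoids, m)
        # New medoid of cluster i: the object minimising the weighted
        # dissimilarity to all other objects.
        new = [min(range(n),
                   key=lambda q: sum((U[j][i] ** m) * R[q][j] for j in range(n)))
               for i in range(c)]
        if new == medoids:
            break
        medoids = new
    return medoids, memberships(R, medoids, m)

# Toy data: 8 points on a ring of radius 1, then 8 points on a ring of radius 3.
points = [(r * math.cos(k * math.pi / 4), r * math.sin(k * math.pi / 4))
          for r in (1.0, 3.0) for k in range(8)]
R = dissimilarity_matrix(points, learned_dissimilarity)
medoids, U = fuzzy_c_medoids(R, c=2)
labels = [row.index(max(row)) for row in U]
print(labels)
```

In this toy setting a purely spatial (Euclidean) dissimilarity would treat opposite points of the same ring as far apart, whereas the relational matrix groups points by ring. Swapping `learned_dissimilarity` for the output of a trained model leaves the clustering code unchanged, since it only ever reads `R`.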

Notes

  1. Actually, our method can deal with both similarity and dissimilarity relations.

Author information

Corresponding author

Correspondence to Francesco Marcelloni.

About this article

Cite this article

Corsini, P., Lazzerini, B. & Marcelloni, F. Combining supervised and unsupervised learning for data clustering. Neural Comput & Applic 15, 289–297 (2006). https://doi.org/10.1007/s00521-006-0030-5

  • DOI: https://doi.org/10.1007/s00521-006-0030-5
