Spectral co-clustering documents and words using fuzzy K-harmonic means

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

This paper analyzes the main steps of spectral co-clustering of documents and words, identifies the cause of its sensitivity to input order, and presents a modified spectral co-clustering method based on fuzzy K-harmonic means. The method consists of two steps. The first step constructs a Laplacian matrix that is insensitive to input order. The second step uses the fuzzy K-harmonic means algorithm instead of the K-means algorithm to obtain the clustering results. The fuzzy K-harmonic means algorithm uses a fuzzy weighted distance when calculating the distance between each data point and the cluster centers. The experiments show that the proposed method is not only insensitive to input order but also improves the accuracy and robustness of the clustering results.
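
To make the second step more concrete, the following is a minimal sketch of a K-harmonic means center update in Python. It is an illustration under stated assumptions, not the method of this paper: the abstract does not define the fuzzy weighted distance, so the sketch uses the plain (non-fuzzy) K-harmonic means update described by Zhang, Hsu and Dayal (1999) with ordinary Euclidean distance. The function name `k_harmonic_means`, the exponent `p=3.5`, the iteration count, and the toy points are all hypothetical choices made for the example. In a spectral co-clustering pipeline, the input points would be the rows of the spectral embedding rather than raw coordinates.

```python
import math


def k_harmonic_means(points, init_centers, p=3.5, iters=50, eps=1e-9):
    """Plain K-harmonic means (assumed stand-in, not the paper's fuzzy variant).

    Each center is moved to a weighted mean of all points, where the weight
    of point i for center j is d_ij^-(p+2) / (sum_l d_il^-p)^2.
    """
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in centers]
        den = [0.0] * len(centers)
        for x in points:
            # Distances to every center, floored at eps to avoid division by zero.
            d = [max(math.dist(x, c), eps) for c in centers]
            s = sum(dj ** -p for dj in d)
            for j, dj in enumerate(d):
                q = dj ** -(p + 2) / (s * s)
                den[j] += q
                for t in range(dim):
                    num[j][t] += q * x[t]
        centers = [[n / den[j] for n in num[j]] for j in range(len(centers))]
    # Hard labels: index of the nearest center after the final update.
    labels = [
        min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
        for x in points
    ]
    return centers, labels


# Toy data: two well-separated groups of three points each (hypothetical).
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_labels = k_harmonic_means(toy_points, [(2.0, 2.0), (8.0, 8.0)])
print(toy_labels)
```

Because every point contributes to every center through the harmonic weighting, the update depends less on which points happen to be nearest to the initial centers than a hard K-means assignment does. Whether this accounts for the robustness reported in the experiments is not something the abstract states.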


Acknowledgment

The authors would like to thank the anonymous reviewers for their insightful comments, which helped improve this paper. This work is supported by National Natural Science Funds (Nos. 61175053, 61073133) and by Innovative Team and Key Scientific Research Projects of Ministry of Education (No. 2011ZD010).

Author information

Corresponding author

Correspondence to Mingyu Lu.

About this article

Cite this article

Liu, N., Chen, F. & Lu, M. Spectral co-clustering documents and words using fuzzy K-harmonic means. Int. J. Mach. Learn. & Cyber. 4, 75–83 (2013). https://doi.org/10.1007/s13042-012-0077-9

  • DOI: https://doi.org/10.1007/s13042-012-0077-9
