Fuzzifying Clustering Algorithms: The Case Study of MajorClust

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4827)

Abstract

Among the document clustering algorithms proposed so far, the most useful are those that automatically reveal the number of clusters and assign each target document to exactly one cluster. However, in many real situations there is no exact boundary between clusters. In this work, we introduce a fuzzy version of the MajorClust algorithm. The proposed clustering method assigns documents to more than one category by taking into account a membership function for both the edges and the nodes of the underlying graph. Thus, the clustering problem is formulated in terms of weighted fuzzy graphs. The fuzzy approach makes it possible to reduce some negative effects that appear when clustering large corpora with noisy data.
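
The abstract does not give the membership functions themselves, so the following is only a minimal sketch of the general idea, not the method of the paper. It assumes that edge weights in [0, 1] play the role of edge memberships, runs a plain (crisp) MajorClust pass in which every node adopts the label that attracts it most strongly, and then derives a hypothetical node membership as the share of a node's total edge weight that points into each cluster. The function names `majorclust` and `fuzzy_memberships` and the toy graph are illustrative assumptions.

```python
from collections import defaultdict


def majorclust(weights, max_iter=100):
    """Crisp MajorClust: every node repeatedly adopts the cluster label
    with the largest summed edge weight among its neighbours."""
    nodes = sorted(weights)
    label = {v: v for v in nodes}  # start with one cluster per node
    for _ in range(max_iter):
        changed = False
        for v in nodes:
            attraction = defaultdict(float)
            for u, w in weights[v].items():
                attraction[label[u]] += w
            if not attraction:
                continue  # isolated node keeps its own label
            # Deterministic tie-break: the smallest label wins among equals.
            best = max(sorted(attraction), key=lambda c: attraction[c])
            # Switch only on a strict improvement, so ties keep the current label.
            if attraction[best] > attraction.get(label[v], 0.0):
                label[v] = best
                changed = True
        if not changed:
            break
    return label


def fuzzy_memberships(weights, label):
    """Assumed fuzzification: membership of node v in cluster c is the
    fraction of v's total edge weight that leads into c."""
    memberships = {}
    for v, neighbours in weights.items():
        attraction = defaultdict(float)
        for u, w in neighbours.items():
            attraction[label[u]] += w
        total = sum(attraction.values())
        if total > 0:
            memberships[v] = {c: a / total for c, a in attraction.items()}
        else:
            memberships[v] = {label[v]: 1.0}
    return memberships


# Toy similarity graph (hypothetical): two triangles joined by a bridge node "x".
weights = {
    "a": {"b": 1.0, "c": 1.0, "x": 0.6},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0},
    "d": {"e": 1.0, "f": 1.0, "x": 0.4},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
    "x": {"a": 0.6, "d": 0.4},
}

labels = majorclust(weights)
memberships = fuzzy_memberships(weights, labels)
print(labels["x"], memberships["x"])
```

In this toy graph the crisp pass places the bridge node `x` in a single cluster, while the membership step keeps its partial affinity to the other one, which is the kind of boundary case the abstract refers to.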

Editor information

Alexander Gelbukh Ángel Fernando Kuri Morales

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Levner, E., Pinto, D., Rosso, P., Alcaide, D., Sharma, R.R.K. (2007). Fuzzifying Clustering Algorithms: The Case Study of MajorClust. In: Gelbukh, A., Kuri Morales, Á.F. (eds) MICAI 2007: Advances in Artificial Intelligence. MICAI 2007. Lecture Notes in Computer Science, vol 4827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76631-5_78

  • DOI: https://doi.org/10.1007/978-3-540-76631-5_78

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76630-8

  • Online ISBN: 978-3-540-76631-5

  • eBook Packages: Computer Science, Computer Science (R0)
