Two Variants of the OKM for Overlapping Clustering

  • Chapter
Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 292)

Abstract

This paper addresses overlapping clustering and presents two extensions of the OKM approach, denoted OKMED and WOKM. OKMED generalizes the well-known k-medoid method to overlapping clustering and helps organize data given any proximity matrix as input. WOKM (Weighted-OKM) proposes a model with local weighting of the clusters; this variant is suited to overlapping clustering because a single data item can match multiple classes according to different features. On text clustering, we show that OKMED behaves similarly to OKM while allowing metrics other than the Euclidean distance. We then observe a significant improvement with the weighted extension of OKM.
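The abstract does not spell out the OKMED procedure, so the following is only a minimal sketch of the general idea it describes: medoid-based clustering in which an item may belong to several clusters, driven solely by a precomputed dissimilarity matrix. The function names (`image_index`, `assign_overlapping`), the definition of an item's "image" as the data item with the smallest total dissimilarity to its assigned medoids, and the greedy stopping rule are all assumptions made for illustration, not the chapter's stated method.

```python
import numpy as np


def image_index(D, medoids):
    """Index of the item with the smallest total dissimilarity to `medoids`.

    Assumption: this item stands in for a combination of several medoids.
    """
    return int(np.argmin(D[:, medoids].sum(axis=1)))


def assign_overlapping(D, medoids):
    """Greedily assign every item to one or more medoids (hypothetical rule).

    D       : (n, n) array of pairwise dissimilarities (any proximity measure).
    medoids : list of item indices acting as cluster representatives.
    """
    assignments = []
    for i in range(D.shape[0]):
        # Consider medoids from nearest to farthest for item i.
        order = sorted(medoids, key=lambda m: D[i, m])
        chosen = [order[0]]
        best = D[i, image_index(D, chosen)]
        for m in order[1:]:
            # Keep the extra medoid only if the image gets strictly closer to i.
            cand = D[i, image_index(D, chosen + [m])]
            if cand < best:
                chosen.append(m)
                best = cand
            else:
                break
        assignments.append(chosen)
    return assignments


# Toy data: three points on a line, squared differences as dissimilarity.
pts = np.array([0.0, 4.0, 2.0])
D = (pts[:, None] - pts[None, :]) ** 2
print(assign_overlapping(D, medoids=[0, 1]))
```

In this toy case the middle point (index 2) ends up in both clusters, while each medoid stays in its own. Only the matrix `D` is consulted, so any dissimilarity could be substituted for the squared difference used here.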

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cleuziou, G. (2010). Two Variants of the OKM for Overlapping Clustering. In: Guillet, F., Ritschard, G., Zighed, D.A., Briand, H. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00580-0_9

  • DOI: https://doi.org/10.1007/978-3-642-00580-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00579-4

  • Online ISBN: 978-3-642-00580-0

  • eBook Packages: Engineering, Engineering (R0)
