Two Variants of the OKM for Overlapping Clustering

Cleuziou, Guillaume

doi:10.1007/978-3-642-00580-0_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 292)

Abstract

This paper deals with overlapping clustering and presents two extensions of the OKM approach, denoted OKMED and WOKM. OKMED generalizes the well-known k-medoid method to overlapping clustering and helps organize data given any proximity matrix as input. WOKM (Weighted-OKM) proposes a model with local weighting of the clusters; this variant is suitable for overlapping clustering since a single data item can match multiple classes according to different features. On text clustering, we show that OKMED behaves similarly to OKM but allows metrics other than the Euclidean distance to be used. We then observe a significant improvement using the weighted extension of OKM.
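
The abstract does not spell out how a point ends up in several clusters, so the following is only a minimal sketch of one way a medoid-based overlapping assignment could work from a precomputed dissimilarity matrix. It assumes, as a simplification not confirmed by this page, that a point is represented by the data item whose summed dissimilarity to the medoids of its assigned clusters is smallest, and that clusters are added greedily while that representative gets closer to the point. The names `assign_overlapping`, `image`, `D`, and `medoids` are hypothetical.

```python
def assign_overlapping(i, medoids, D):
    """Greedily assign point i to one or more clusters.

    medoids: list of data indices, one medoid per cluster (assumed given).
    D: square dissimilarity matrix as a list of lists; any measure works.
    Returns the list of cluster indices chosen for point i.
    """
    n = len(D)

    def image(assigned):
        # Assumed representative: the data item with the smallest summed
        # dissimilarity to the medoids of the assigned clusters.
        return min(range(n),
                   key=lambda j: sum(D[j][medoids[c]] for c in assigned))

    # Visit clusters from the nearest medoid to the farthest one.
    order = sorted(range(len(medoids)), key=lambda c: D[i][medoids[c]])

    assigned = [order[0]]
    best = D[i][image(assigned)]
    for c in order[1:]:
        trial = assigned + [c]
        cost = D[i][image(trial)]
        if cost < best:
            assigned, best = trial, cost
        else:
            break  # stop at the first cluster that does not improve the fit
    return assigned


# Toy data: three points on a line, squared difference as dissimilarity.
positions = [0.0, 1.0, 2.0]
D = [[(a - b) ** 2 for b in positions] for a in positions]
medoids = [0, 2]  # clusters 0 and 1 are represented by points 0 and 2

memberships = [assign_overlapping(i, medoids, D) for i in range(len(D))]
```

Because only `D` is consulted, swapping in another dissimilarity (for example a cosine-based one for text) requires no change to the function, which is the property the abstract attributes to OKMED. In the toy data, the middle point falls between the two medoids and is the only one that receives both clusters.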

Author information

Authors and Affiliations

LIFO, University of Orléans
Guillaume Cleuziou

Editor information

Editors and Affiliations

Polytechnic School of Nantes University, Nantes, France
Fabrice Guillet & Henri Briand
Université de Genève, Genève, Switzerland
Gilbert Ritschard
Université Lumière Lyon 2, Bron, France
Djamel Abdelkader Zighed

About this chapter

Cite this chapter

Cleuziou, G. (2010). Two Variants of the OKM for Overlapping Clustering. In: Guillet, F., Ritschard, G., Zighed, D.A., Briand, H. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00580-0_9

DOI: https://doi.org/10.1007/978-3-642-00580-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00579-4
Online ISBN: 978-3-642-00580-0
eBook Packages: Engineering, Engineering (R0)
