Clustering with Lower-Bounded Sizes

Abu-Khzam, Faisal N.; Bazgan, Cristina; Casel, Katrin; Fernau, Henning

doi:10.1007/s00453-017-0374-5

A General Graph-Theoretic Framework

Published: 19 September 2017

Algorithmica, Volume 80, pages 2517–2550 (2018)

Faisal N. Abu-Khzam¹,
Cristina Bazgan²˒³,
Katrin Casel⁴ &
Henning Fernau⁴ (ORCID: orcid.org/0000-0002-4444-3220)

Abstract

Classical clustering problems search for a partition of objects into a fixed number of clusters. In many scenarios, however, the number of clusters is not known in advance or need not be fixed. Further, clusters are sometimes only considered significant if they reach a certain size. We discuss clustering into sets of minimum cardinality $k$ without a fixed number of sets and present a general model for these types of problems. This general framework allows the comparison of different measures for assessing the quality of a clustering. We specifically consider nine quality measures and classify the complexity of the resulting problems with respect to $k$. Further, we derive some polynomial-time solvable cases for $k=2$ with connections to matching-type problems which, among other graph problems, are then used to compute approximations for larger values of $k$.
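
To make the problem setting concrete, the following is a minimal brute-force sketch, not the authors' algorithm. It enumerates every partition of a small set of points on a line into clusters of size at least k, with no fixed number of clusters, and returns one that minimises the sum of cluster diameters. The diameter-sum objective, the function names, and the sample data are assumptions chosen for illustration; the abstract does not name the nine measures.

```python
from itertools import combinations


def partitions_min_size(items, k):
    """Yield every partition of `items` (distinct indices) into blocks of size >= k."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # The block containing `first` needs at least k - 1 further members.
    for extra in range(max(k - 1, 0), len(rest) + 1):
        for others in combinations(rest, extra):
            block = (first,) + others
            remaining = [x for x in rest if x not in others]
            for tail in partitions_min_size(remaining, k):
                yield [block] + tail


def diameter(block, points):
    """Largest distance between two points of a cluster (points on a line)."""
    values = [points[i] for i in block]
    return max(values) - min(values)


def best_clustering(points, k):
    """Return (cost, clusters) for the assumed diameter-sum measure, or None."""
    best = None
    for partition in partitions_min_size(list(range(len(points))), k):
        cost = sum(diameter(block, points) for block in partition)
        if best is None or cost < best[0]:
            best = (cost, partition)
    return best


# Hypothetical sample data: three well-separated groups of points.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 20.0, 21.0, 22.0]
cost, clusters = best_clustering(points, k=2)
print(cost, clusters)
```

The search is exponential in the number of points, so it is only meant to show what a feasible solution looks like: the number of clusters is an outcome of the optimisation rather than an input, and `best_clustering` returns `None` when no partition respects the lower bound.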

Notes

This covering problem is sometimes also called Unweighted Simplex Matching and is equivalent to $\{K_2,K_3\}$-packing, an old, well-studied generalisation of the classical matching problem [7].
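
As a small illustration of the packing view, the sketch below checks by exhaustive search whether the vertices of a graph can be partitioned into vertex-disjoint edges ($K_2$) and triangles ($K_3$). It assumes the perfect (vertex-covering) variant of $\{K_2,K_3\}$-packing; the function name and the example graphs are hypothetical, and the exponential search is for illustration only, not the polynomial-time method of [7].

```python
def has_k2_k3_cover(vertices, edges):
    """Return True if `vertices` split into vertex-disjoint edges and triangles."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def solve(remaining):
        if not remaining:
            return True
        v = min(remaining)  # always cover the smallest uncovered vertex next
        rest = remaining - {v}
        nbrs = sorted(adj[v] & rest)
        # Option 1: cover v with a single edge {v, u}.
        for u in nbrs:
            if solve(rest - {u}):
                return True
        # Option 2: cover v with a triangle {v, u, w}.
        for i, u in enumerate(nbrs):
            for w in nbrs[i + 1:]:
                if w in adj[u] and solve(rest - {u, w}):
                    return True
        return False

    return solve(frozenset(vertices))


# Hypothetical example graphs.
triangle = has_k2_k3_cover([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
path3 = has_k2_k3_cover([0, 1, 2], [(0, 1), (1, 2)])
path4 = has_k2_k3_cover([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
print(triangle, path3, path4)
```

A triangle is covered by a single $K_3$ and a path on four vertices by two edges, whereas a path on three vertices admits neither a perfect matching nor a triangle, which is exactly the case where allowing $K_3$ alongside $K_2$ matters.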

References

1. Abu-Khzam, F.N., Bazgan, C., Casel, K., Fernau, H.: Building clusters with lower-bounded sizes. In: Hong, S. (ed.) 27th International Symposium on Algorithms and Computation, ISAAC, LIPIcs, vol. 64, pp. 4:1–4:13. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2016)
2. Aggarwal, G., Panigrahy, R., Feder, T., Thomas, D., Kenthapadi, K., Khuller, S., Zhu, A.: Achieving anonymity via clustering. ACM Trans. Algorithms 6(3), 49 (2010)
3. Anshelevich, E., Karagiozova, A.: Terminal backup, 3D matching, and covering cubic graphs. SIAM J. Comput. 40(3), 678–708 (2011)
4. Armon, A.: On min–max $r$-gatherings. Theor. Comput. Sci. 412(7), 573–582 (2011)
5. Blocki, J., Williams, R.: Resolving the complexity of some data privacy problems. In: Abramsky, S., Gavoille, C., Kirchner, C., auf der Heide, F.M., Spirakis, P.G. (eds.) Proceedings of the 37th International Colloquium on Automata, Languages and Programming, ICALP’10: Part II, LNCS, vol. 6199, pp. 393–404. Springer (2010)
6. Byun, J.W., Kamra, A., Bertino, E., Li, N.: Efficient $k$-anonymization using clustering techniques. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds.) Advances in Databases: Concepts, Systems and Applications, LNCS, vol. 4443, pp. 188–200. Springer, Berlin (2007)
7. Cornuéjols, G., Hartvigsen, D., Pulleyblank, W.: Packing subgraphs in a graph. Oper. Res. Lett. 1(4), 139–143 (1982)
8. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Practical data-oriented microaggregation for statistical disclosure control. IEEE Trans. Knowl. Data Eng. 14(1), 189–201 (2002)
9. Domingo-Ferrer, J., Sebé, F.: Optimal multivariate 2-microaggregation for microdata protection: a 2-approximation. In: Domingo-Ferrer, J., Franconi, L. (eds.) Privacy in Statistical Databases, PSD’06, LNCS, vol. 4302, pp. 129–138. Springer, Berlin (2006)
10. Edmonds, J., Johnson, E.L.: Matching, Euler tours and the Chinese postman. Math. Program. 5, 88–124 (1973)
11. Ergün, F., Kumar, R., Rubinfeld, R.: Fast approximate PCPs. In: Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, 1–4 May 1999, Atlanta, Georgia, USA, pp. 41–50 (1999)
12. Goemans, M., Williamson, D.: A general approximation technique for constrained forest problems. SIAM J. Comput. 24(2), 296–317 (1995)
13. Guha, S., Meyerson, A., Munagala, K.: Hierarchical placement and network design problems. In: Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, FOCS’00, pp. 603–612. IEEE Computer Society (2000)
14. King, V., Rao, S., Tarjan, R.: A faster deterministic maximum flow algorithm. J. Algorithms 17(3), 447–474 (1994)
15. Orlin, J.B.: Max flows in $O(nm)$ time, or better. In: Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, STOC, pp. 765–774. ACM (2013)
16. Papadimitriou, C.H., Yannakakis, M.: Optimization, approximation, and complexity classes. J. Comput. Syst. Sci. 43, 425–440 (1991)
17. Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)
18. Schrijver, A.: Combinatorial Optimization. Springer, Berlin (2003)
19. Shalita, A., Zwick, U.: Efficient algorithms for the 2-gathering problem. ACM Trans. Algorithms 6(2), 34 (2010)
20. Stokes, K.: On computational anonymity. In: Privacy in Statistical Databases—UNESCO Chair in Data Privacy, International Conference, PSD 2012, Palermo, Italy, 26–28 September 2012. Proceedings, pp. 336–347 (2012)
21. Tovey, C.: A simplified NP-complete satisfiability problem. Discrete Appl. Math. 8(1), 85–89 (1984)
22. Xu, D., Anshelevich, E., Chiang, M.: On survivable access network design: complexity and algorithms. In: INFOCOM 2008. 27th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 13–18 April 2008, Phoenix, AZ, USA, pp. 186–190 (2008)

Acknowledgements

Katrin Casel and Henning Fernau were supported by the Deutsche Forschungsgemeinschaft (German Science Foundation) under grant FE 560/6-1. Faisal Abu-Khzam and Cristina Bazgan were partially supported by the bilateral research cooperation CEDRE between France and Lebanon (grant number 30885TM). We are grateful to the anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

¹ Lebanese American University, Chouran, Beirut, 1102 2801, Lebanon
Faisal N. Abu-Khzam
² CNRS, UMR 7243, LAMSADE, Université Paris-Dauphine, PSL Research University, 75016, Paris, France
Cristina Bazgan
³ Institut Universitaire de France, Paris, France
Cristina Bazgan
⁴ Universität Trier, Fachber. 4 - Abteilung Informatikwissenschaften, 54286, Trier, Germany
Katrin Casel & Henning Fernau

Corresponding author

Correspondence to Henning Fernau.

About this article

Cite this article

Abu-Khzam, F.N., Bazgan, C., Casel, K. et al. Clustering with Lower-Bounded Sizes. Algorithmica 80, 2517–2550 (2018). https://doi.org/10.1007/s00453-017-0374-5

Received: 07 April 2017
Accepted: 08 September 2017
Published: 19 September 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s00453-017-0374-5

Part of a collection:

Special Issue dedicated to the 60th Birthday of Gregory Gutin
