Approximation algorithms for two variants of correlation clustering problem

Journal of Combinatorial Optimization

Abstract

The correlation clustering problem is a clustering problem with many applications, including protein interaction networks, cross-lingual link detection, communication networks, and social computing. In this paper, we introduce two variants of the correlation clustering problem: correlation clustering on uncertain graphs and correlation clustering with non-uniform hard constrained cluster sizes. Both variants overcome some of the limitations of existing variants of the correlation clustering problem and have practical applications. We provide a constant-factor approximation algorithm for the former problem and two approximation algorithms for the latter.

Acknowledgements

The first two authors are supported by the National Natural Science Foundation of China (Nos. 11531014, 11871081). The third author is supported by the Higher Educational Science and Technology Program of Shandong Province (No. J17KA171) and the Natural Science Foundation of Shandong Province (No. ZR2019MA032) of China. The fourth author is supported by the National Natural Science Foundation of China (No. 61433012).

Author information

Corresponding author

Correspondence to Min Li.

Additional information

A preliminary version of this paper appeared in Proceedings of the 13th International Conference on Algorithmic Aspects in Information and Management, pp. 159-168, 2019.

About this article

Cite this article

Ji, S., Xu, D., Li, M. et al. Approximation algorithms for two variants of correlation clustering problem. J Comb Optim 43, 933–952 (2022). https://doi.org/10.1007/s10878-020-00612-1

  • DOI: https://doi.org/10.1007/s10878-020-00612-1
