Distributed Detection of Clusters of Arbitrary Size

  • Conference paper
Structural Information and Communication Complexity (SIROCCO 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12810)

Abstract

Graph clustering is a fundamental technique in data analysis with a vast number of applications in computer science and statistics. In theoretical computer science, the problem of graph clustering has received significant research attention over the past two decades, which has led to pivotal algorithmic breakthroughs. However, the design of most graph clustering algorithms is based on complicated techniques from computational optimisation, which are not applicable for processing massive data sets stored in physically remote locations.

In this work we present a novel distributed algorithm for graph clustering. Most previous algorithms only work for graphs whose clusters are of balanced size, which restricts their applicability in many practical settings. Our proposed algorithm works for graphs with clusters of arbitrary size, and its performance is analysed with respect to every individual cluster. In addition, our algorithm is easy to implement, and only requires a poly-logarithmic number of rounds for many graphs occurring in practice.

Notes

  1. Every node v can randomly select a number from the range \(\left[1, \mathrm{poly}(n)\right]\), such that, with high probability, those numbers are distinct.

  2. The minimum does not play any special role here; it is only used to guarantee consensus among all nodes in the same cluster. The maximum ID works just as well.

  3. For a more detailed discussion of the connection between the sets \(\{f_i\}\), \(\{\chi_{S_i}\}\), \(\{\widetilde{\chi}_i\}\) we refer the reader to [19].
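
Notes 1 and 2 can be illustrated with a minimal Python sketch. It assumes poly(n) = n^3, which the text does not fix, and the helper names `draw_ids`, `collision_bound`, and `agree_on_label` are hypothetical, introduced only for illustration; this is not the paper's implementation. With that choice, a union bound over all pairs of nodes gives a collision probability of at most n(n-1)/2 divided by n^3, which is below 1/(2n).

```python
import random


def draw_ids(n, exponent=3, rng=None):
    """Each of n nodes independently draws an ID uniformly from [1, n**exponent].

    The exponent is an assumption; the text only says the range is poly(n).
    """
    rng = rng or random.Random()
    upper = n ** exponent
    return [rng.randint(1, upper) for _ in range(n)]


def collision_bound(n, exponent=3):
    """Union bound over all pairs: P[two IDs coincide] <= C(n, 2) / n**exponent."""
    return (n * (n - 1) / 2) / (n ** exponent)


def agree_on_label(cluster_ids, pick=min):
    """Every node applies the same deterministic rule to the IDs it knows.

    Any rule that all nodes share yields consensus; min and max both work.
    """
    return pick(cluster_ids)


if __name__ == "__main__":
    ids = draw_ids(1000)
    print("distinct:", len(set(ids)) == len(ids))
    print("collision bound:", collision_bound(1000))
    print("label via min:", agree_on_label(ids[:10]))
    print("label via max:", agree_on_label(ids[:10], pick=max))
```

Swapping `min` for `max` changes which label a cluster ends up with, but not the fact that all of its nodes agree, which is the point of note 2.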

References

  1. Allen-Zhu, Z., Lattanzi, S., Mirrokni, V.S.: A local algorithm for finding well-connected clusters. In: 30th International Conference on Machine Learning (ICML 2013), pp. 396–404 (2013)

  2. Becchetti, L., Clementi, A.E.F., Manurangsi, P., Natale, E., Pasquale, F., Raghavendra, P., Trevisan, L.: Average whenever you meet: Opportunistic protocols for community detection. In: 26th European Symposium on Algorithms (ESA’18). LIPIcs, vol. 112, pp. 7:1–7:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018). https://doi.org/10.4230/LIPIcs.ESA.2018.7

  3. Becchetti, L., Clementi, A.E.F., Natale, E., Pasquale, F., Trevisan, L.: Find your place: simple distributed algorithms for community detection. SIAM J. Comput. 49(4), 821–864 (2020). https://doi.org/10.1137/19M1243026

  4. Becchetti, L., Cruciani, E., Pasquale, F., Rizzo, S.: Step-by-step community detection in volume-regular graphs. Theoret. Comput. Sci. 847, 49–67 (2020). https://doi.org/10.1016/j.tcs.2020.09.036

  5. Boyd, S.P., Ghosh, A., Prabhakar, B., Shah, D.: Randomized gossip algorithms. IEEE Trans. Inf. Theory 52(6), 2508–2530 (2006). https://doi.org/10.1109/TIT.2006.874516

  6. Chang, Y., Saranurak, T.: Improved distributed expander decomposition and nearly optimal triangle enumeration. In: Robinson, P., Ellen, F. (eds.) Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Toronto, ON, Canada, 29 July–2 August 2019, pp. 66–73. ACM (2019). https://doi.org/10.1145/3293611.3331618

  7. Chang, Y., Saranurak, T.: Deterministic distributed expander decomposition and routing with applications in distributed derandomization. In: 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020), pp. 377–388. IEEE (2020). https://doi.org/10.1109/FOCS46700.2020.00043

  8. Chen, J., Sun, H., Woodruff, D.P., Zhang, Q.: Communication-optimal distributed clustering. In: Lee, D.D., Sugiyama, M., von Luxburg, U., Guyon, I., Garnett, R. (eds.) 29th Advances in Neural Information Processing Systems (NeurIPS 2016), pp. 3720–3728 (2016)

  9. Czumaj, A., Peng, P., Sohler, C.: Testing cluster structure of graphs. In: 47th Annual ACM Symposium on Theory of Computing (STOC 2015), pp. 723–732. ACM (2015). https://doi.org/10.1145/2746539.2746618

  10. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)

  11. Georgakopoulos, A., Haslegrave, J., Sauerwald, T., Sylvester, J.: The power of two choices for random walks. arXiv preprint arXiv:1911.05170 (2019)

  12. Hui, P., Yoneki, E., Chan, S.Y., Crowcroft, J.: Distributed community detection in delay tolerant networks. In: Proceedings of 2nd ACM/IEEE International Workshop on Mobility in the Evolving Internet Architecture. Association for Computing Machinery (2007). https://doi.org/10.1145/1366919.1366929

  13. Laenen, S., Sun, H.: Higher-order spectral clustering of directed graphs. In: 33rd Advances in Neural Information Processing Systems (NeurIPS 2020) (2020)

  14. Lee, J.R., Gharan, S.O., Trevisan, L.: Multiway spectral partitioning and higher-order Cheeger inequalities. J. ACM 61(6), 37:1–37:30 (2014). https://doi.org/10.1145/2665063

  15. Li, A., Peng, P.: Community structures in classical network models. Internet Math. 7(2), 81–106 (2011). https://doi.org/10.1080/15427951.2011.566458

  16. von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007). https://doi.org/10.1007/s11222-007-9033-z

  17. Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: 14th Advances in Neural Information Processing Systems (NeurIPS 2001), pp. 849–856. MIT Press (2001)

  18. Oveis Gharan, S., Trevisan, L.: Partitioning into expanders. In: 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2014), pp. 1256–1266. SIAM (2014). https://doi.org/10.1137/1.9781611973402.93

  19. Peng, R., Sun, H., Zanetti, L.: Partitioning well-clustered graphs: spectral clustering works! SIAM J. Comput. 46(2), 710–743 (2017). https://doi.org/10.1137/15M1047209

  20. Sauerwald, T., Zanetti, L.: Random walks on dynamic graphs: mixing times, hitting times, and return probabilities. In: 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). LIPIcs, vol. 132, pp. 93:1–93:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). https://doi.org/10.4230/LIPIcs.ICALP.2019.93

  21. Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007). https://doi.org/10.1016/j.cosrev.2007.05.001

  22. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000). https://doi.org/10.1109/34.868688

  23. Sun, H., Zanetti, L.: Distributed graph clustering and sparsification. ACM Trans. Parallel Comput. 6(3), 17:1–17:23 (2019). https://doi.org/10.1145/3364208

  24. Yang, W., Xu, H.: A divide and conquer framework for distributed graph clustering. In: 32nd International Conference on Machine Learning (ICML 2015). JMLR Workshop and Conference Proceedings, vol. 37, pp. 504–513 (2015). JMLR.org

Acknowledgements

I would like to thank my supervisor Dr. He Sun for helpful discussion and comments on improving the presentation of this paper.

Author information

Corresponding author

Correspondence to Bogdan-Adrian Manghiuc.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Manghiuc, BA. (2021). Distributed Detection of Clusters of Arbitrary Size. In: Jurdziński, T., Schmid, S. (eds) Structural Information and Communication Complexity. SIROCCO 2021. Lecture Notes in Computer Science, vol 12810. Springer, Cham. https://doi.org/10.1007/978-3-030-79527-6_21

  • DOI: https://doi.org/10.1007/978-3-030-79527-6_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79526-9

  • Online ISBN: 978-3-030-79527-6

  • eBook Packages: Computer Science, Computer Science (R0)
