Locating communities on graphs with variations in community sizes

The Journal of Supercomputing

Abstract

Community detection is a fundamental problem in networking and computing. Its applications include finding web communities, uncovering the structure of social networks, and analyzing a graph’s structure to detect Internet attacks. In this paper, we propose an algorithm that finds the entire community structure of a network, represented by an undirected, unweighted graph, based on local interactions between neighboring nodes and on an unsupervised centralized clustering algorithm. The novelty of the proposed approach is that it relies on network coordinates computed by a distributed algorithm. Experimental results and comparisons with the method of Lancichinetti et al. (Phys. Rev. E 80(5 Pt 2), 056117, 2009; New J. Phys. 11(3), 033015, 2009) are presented for a variety of benchmark graphs with known community structure, derived by varying a number of graph parameters. Emphasis is placed on benchmark graphs with significant variations in the size of their communities. Further experimental results are presented for two real-world graphs, Enron and Epinions, from SNAP, the Stanford Large Network Dataset Collection. The experimental results demonstrate the high accuracy of our algorithm in detecting communities, as well as its computational efficiency.
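The two-stage idea summarized above (coordinates obtained from local node interactions, followed by centralized clustering) can be illustrated with a minimal sketch. The abstract does not give the update rule or the clustering details, so everything below is an assumption made for illustration: a Vivaldi-style spring update in which neighbors are pulled toward distance 0 and randomly sampled non-neighbors are pushed toward a hypothetical distance `far`, followed by plain k-means with farthest-point seeding and a number of clusters `k` supplied by the caller (the paper's clustering is unsupervised). The function names `embed`, `kmeans`, `detect_communities`, and `two_cliques` are hypothetical.

```python
import math
import random


def embed(adj, dim=2, far=10.0, step=0.05, rounds=2000, seed=0):
    """Spring-style coordinates: neighbours target distance 0,
    sampled non-neighbours target distance `far` (all values assumed)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    nbrs = {u: sorted(adj[u]) for u in nodes}
    rest = {u: [v for v in nodes if v != u and v not in adj[u]] for u in nodes}
    pos = {u: [rng.random() for _ in range(dim)] for u in nodes}
    for _ in range(rounds):
        for u in nodes:
            if not nbrs[u] and not rest[u]:
                continue  # single-node graph: nothing to interact with
            # Choose one peer: a neighbour (pull) or a non-neighbour (push).
            if nbrs[u] and (not rest[u] or rng.random() < 0.5):
                v, target = rng.choice(nbrs[u]), 0.0
            else:
                v, target = rng.choice(rest[u]), far
            diff = [b - a for a, b in zip(pos[u], pos[v])]
            dist = math.sqrt(sum(d * d for d in diff))
            if dist < 1e-12:
                # Coincident points: use a fixed arbitrary direction.
                diff, norm = [1.0] + [0.0] * (dim - 1), 1.0
            else:
                norm = dist
            # Positive shift moves u towards v, negative shift away from it.
            shift = step * (dist - target) / norm
            pos[u] = [a + shift * d for a, d in zip(pos[u], diff)]
    return pos


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def detect_communities(adj, k):
    """Embed the graph, then cluster the coordinates; returns node -> label."""
    pos = embed(adj)
    nodes = sorted(adj)
    return dict(zip(nodes, kmeans([pos[u] for u in nodes], k)))


def two_cliques(n=5):
    """Toy graph: two n-cliques joined by a single bridge edge."""
    adj = {u: set() for u in range(2 * n)}
    for base in (0, n):
        for u in range(base, base + n):
            adj[u] = {v for v in range(base, base + n) if v != u}
    adj[n - 1].add(n)
    adj[n].add(n - 1)
    return adj
```

Calling `detect_communities(two_cliques(5), k=2)` returns a dictionary mapping each node to a cluster label. In this sketch each node recomputes its position from one sampled peer at a time, which mimics a local, distributed update; only the final clustering step needs all coordinates in one place.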

References

  1. Bagrow JP, Bollt EM (2005) Local method for detecting communities. Phys Rev E 72(4):046108

  2. Buter B, Dijkshoorn N, Modolo D, Nguyen Q, van Noort S, van de Poel B, Ali A, Salah A (2011) Explorative visualization and analysis of a social network for arts: The case of deviantart. J Converg 2(1):87–94

  3. Dabek F, Cox R, Kaashoek F, Morris R (2004) Vivaldi: A decentralized network coordinate system. In: Proceedings of the ACM SIGCOMM’04 conference, August 2004

  4. Datta S, Giannella CR, Kargupta H (2006) K-means clustering over a large, dynamic network. In: Proc SIAM int’l conf data mining, pp 153–164

  5. Datta S, Giannella CR, Kargupta H (2009) Approximate distributed k-means clustering over a peer-to-peer network. IEEE Trans Knowl Data Eng 21:1372–1388

  6. Derényi I, Palla G, Vicsek T (2005) Clique percolation in random networks. Phys Rev Lett 94(16):160202

  7. Dutta A, Ghosh I, Mukhopadhyay D (2009) An advanced partitioning approach of web page clustering utilizing content & link structure. J Converg Inf Technol 4(3):65–71

  8. Flake GW, Lawrence S, Giles CL, Coetzee FM (2002) Self-organization and identification of web communities. IEEE Comput 35:66–71

  9. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826

  10. Papadakis H, Panagiotakis C, Fragopoulou P (2011) Distributed community detection: Finding neighborhoods in a complex world using synthetic coordinates. In: IEEE symposium on computers and communications, pp 1145–1150

  11. Jo T (2008) Inverted index based modified version of k-means algorithm for text clustering. J Inf Process Syst 4(2)

  12. Katsaros D, Pallis G, Stamos K, Vakali A, Sidiropoulos A, Manolopoulos Y (2009) CDNs content outsourcing via generalized communities. IEEE Trans Knowl Data Eng 21:137–151

  13. Katsavounidis I, Kuo C-CJ, Zhang Z (1994) A new initialization technique for generalized Lloyd iteration. IEEE Signal Process Lett 1(10):144–146

  14. Lancichinetti A, Fortunato S (2009) Community detection algorithms: A comparative analysis. Phys Rev E 80(5 Pt 2):056117

  15. Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015

  16. Lancichinetti A, Fortunato S (2009) Community detection algorithms: A comparative analysis. Phys Rev E 80(5):056117

  17. Liu Y, Li W, Li Y-C (2007) Network traffic classification using k-means clustering. In: International multi-symposiums on computer and computational sciences, pp 360–365

  18. MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proc of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 281–297

  19. Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113

  20. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818

  21. Papadopoulos S, Skusa A, Vakali A, Kompatsiaris Y, Wagner N (2009) Bridge bounding: A local approach for efficient community discovery in complex networks. Technical Report, arXiv:0902.0871, February 2009

  22. Ray S, Turi R (1999) Determination of number of clusters in k-means clustering and application in colour image segmentation. In: Proceedings of the 4th international conference on advances in pattern recognition and digital techniques, pp 137–143

  23. Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1(1):27–64

  24. SNAP Stanford large network dataset collection. http://snap.stanford.edu

  25. Van Dongen S (2008) Graph clustering via a discrete uncoupling process. SIAM J Matrix Anal Appl 30:121–141

  26. Wang S, Tsai Y, Shen C, Chen P (2010) Hierarchical key derivation scheme for group-oriented communication systems. Int J Inf Technol Commun Converg 1(1):66–76

  27. Wu F, Huberman BA (2004) Finding communities in linear time: A physics approach. Eur Phys J, B Cond Matter Complex Syst 38(2):331–338

  28. Ye Y, Li X, Wu B, Li Y (2011) A comparative study of feature weighting methods for document co-clustering. Int J Inf Technol Commun Converg 1(2):206–220

  29. Yu F, Oyana D, Hou W, Wainer M (2010) Approximate clustering on data streams using discrete cosine transform. J Inf Process Syst 6(1):67–78

Acknowledgements

This project is implemented through the Operational Program “ARCHIMEDE III: Education and Lifelong Learning” (project P2PCOORD) and is co-financed by the European Union (European Social Fund) and Greek national funds (National Strategic Reference Framework 2007–2013).

Author information

Corresponding author

Correspondence to Costas Panagiotakis.

About this article

Cite this article

Papadakis, H., Panagiotakis, C. & Fragopoulou, P. Locating communities on graphs with variations in community sizes. J Supercomput 65, 543–561 (2013). https://doi.org/10.1007/s11227-012-0806-6

  • DOI: https://doi.org/10.1007/s11227-012-0806-6
