Abstract
Community detection is a fundamental problem in networking and computing. Its applications include finding web communities, uncovering the structure of social networks, and analyzing a graph’s structure to uncover Internet attacks. In this paper, we propose an algorithm that finds the entire community structure of a network, represented by an undirected, unweighted graph, based on local interactions between neighboring nodes and on an unsupervised centralized clustering algorithm. The novelty of the proposed approach is that it relies on network coordinates computed by a distributed algorithm. Experimental results and comparisons with the Lancichinetti et al. method (Phys. Rev. E 80(5 Pt 2), 056117, 2009; New J. Phys. 11(3), 033015, 2009) are presented for a variety of benchmark graphs with known community structure, derived by varying a number of graph parameters. Emphasis is placed on benchmark graphs with significant variations in the size of their communities. Further experimental results are presented for two real-world graphs, Enron and Epinions, from SNAP, the Stanford Large Network Dataset Collection. The experimental results demonstrate the high accuracy of our algorithm in detecting communities, as well as its computational efficiency.
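The two-stage idea summarized in the abstract (coordinates shaped by local interactions, followed by centralized clustering) can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions made only for illustration: a Vivaldi-style spring update, a target distance of 0 between neighbors and a hypothetical `far` distance between non-neighbors, a sequential all-pairs loop standing in for the distributed exchange, and plain k-means with a fixed `k` and farthest-point seeding instead of an unsupervised choice of the number of clusters. All function and parameter names are hypothetical.

```python
import math
import random


def synthetic_coordinates(n, edges, far=10.0, step=0.05, sweeps=500, seed=0):
    """Assumed spring model: neighbors attract to distance 0,
    non-neighbors settle at distance `far` (hypothetical parameter)."""
    rng = random.Random(seed)
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            # A distributed version would contact neighbors and a sample of
            # other nodes; all pairs are visited here for simplicity.
            for j in range(n):
                if i == j:
                    continue
                target = 0.0 if j in nbrs[i] else far
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy)
                if dist == 0.0:
                    continue
                # Move i toward j if too far apart, away from j if too close.
                move = step * (dist - target) / dist
                pos[i][0] += move * dx
                pos[i][1] += move * dy
    return pos


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-point seeding."""
    centers = [max(points, key=lambda p: math.hypot(p[0], p[1]))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(_dist(p, c) for c in centers)))
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: _dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels


# Toy graph: two 5-node cliques joined by the single edge (4, 5).
cliques = (range(0, 5), range(5, 10))
edges = [(u, v) for c in cliques for u in c for v in c if u < v] + [(4, 5)]
labels = kmeans(synthetic_coordinates(10, edges), k=2)
```

On this toy graph, each clique is expected to collapse to a tight group of points, so the clustering step should assign the two cliques to different labels. Larger or sparser graphs would likely need a different `far`, `step`, and number of sweeps.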
References
Bagrow JP, Bollt EM (2005) Local method for detecting communities. Phys Rev E 72(4):046108
Buter B, Dijkshoorn N, Modolo D, Nguyen Q, van Noort S, van de Poel B, Ali A, Salah A (2011) Explorative visualization and analysis of a social network for arts: The case of deviantart. J Converg 2(1):87–94
Dabek F, Cox R, Kaashoek F, Morris R (2004) Vivaldi: A decentralized network coordinate system. In: Proceedings of the ACM SIGCOMM’04 conference, August 2004
Datta S, Giannella CR, Kargupta H (2006) K-means clustering over a large, dynamic network. In: Proc SIAM int’l conf data mining, pp 153–164
Datta S, Giannella CR, Kargupta H (2009) Approximate distributed k-means clustering over a peer-to-peer network. IEEE Trans Knowl Data Eng 21:1372–1388
Derényi I, Palla G, Vicsek T (2005) Clique percolation in random networks. Phys Rev Lett 94(16):160202
Dutta A, Ghosh I, Mukhopadhyay D (2009) An advanced partitioning approach of web page clustering utilizing content & link structure. J Converg Inf Technol 4(3):65–71
Flake GW, Lawrence S, Giles CL, Coetzee FM (2002) Self-organization and identification of web communities. IEEE Comput 35:66–71
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826
Papadakis H, Fragopoulou P, Panagiotakis C (2011) Distributed community detection: Finding neighborhoods in a complex world using synthetic coordinates. In: IEEE symposium on computers and communications, pp 1145–1150
Jo T (2008) Inverted index based modified version of k-means algorithm for text clustering. J Inf Process Syst 4(2)
Katsaros D, Pallis G, Stamos K, Vakali A, Sidiropoulos A, Manolopoulos Y (2009) CDNs content outsourcing via generalized communities. IEEE Trans Knowl Data Eng 21:137–151
Katsavounidis I, Kuo C-CJ, Zhang Z (1994) A new initialization technique for generalized Lloyd iteration. IEEE Signal Process Lett 1(10):144–146
Lancichinetti A, Fortunato S (2009) Community detection algorithms: A comparative analysis. Phys Rev E 80(5 Pt 2):056117
Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015
Liu Y, Li W, Li Y-C (2007) Network traffic classification using k-means clustering. In: International multi-symposiums on computer and computational sciences, pp 360–365
MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proc of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 281–297
Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435:814–818
Papadopoulos S, Skusa A, Vakali A, Kompatsiaris Y, Wagner N (2009) Bridge bounding: A local approach for efficient community discovery in complex networks. Technical Report, arXiv:0902.0871, February 2009
Ray S, Turi R (1999) Determination of number of clusters in k-means clustering and application in colour image segmentation. In: Proceedings of the 4th international conference on advances in pattern recognition and digital techniques, pp 137–143
Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1(1):27–64
SNAP Stanford large network dataset collection. http://snap.stanford.edu
Van Dongen S (2008) Graph clustering via a discrete uncoupling process. SIAM J Matrix Anal Appl 30:121–141
Wang S, Tsai Y, Shen C, Chen P (2010) Hierarchical key derivation scheme for group-oriented communication systems. Int J Inf Technol Commun Converg 1(1):66–76
Wu F, Huberman BA (2004) Finding communities in linear time: A physics approach. Eur Phys J B Condens Matter Complex Syst 38(2):331–338
Ye Y, Li X, Wu B, Li Y (2011) A comparative study of feature weighting methods for document co-clustering. Int J Inf Technol Commun Converg 1(2):206–220
Yu F, Oyana D, Hou W, Wainer M (2010) Approximate clustering on data streams using discrete cosine transform. J Inf Process Syst 6(1):67–78
Acknowledgements
This project is implemented through the Operational Program “ARCHIMEDE III: Education and Lifelong Learning” (project P2PCOORD) and is co-financed by the European Union (European Social Fund) and Greek national funds (National Strategic Reference Framework 2007–2013).
Cite this article
Papadakis, H., Panagiotakis, C. & Fragopoulou, P. Locating communities on graphs with variations in community sizes. J Supercomput 65, 543–561 (2013). https://doi.org/10.1007/s11227-012-0806-6
DOI: https://doi.org/10.1007/s11227-012-0806-6