Abstract
One of the most important features of real networks is the presence of community structures: subsets of nodes that are more densely connected to each other than to the rest of the network, and that encode information about the organization and functionality of the nodes. Social networking sites (SNSs), which allow millions of users to interact, have important scientific and practical implications; however, analyzing networks of this size requires fast algorithms. We focus on the algorithm developed by Clauset, Newman, and Moore (CNM) and its widely used modifications, and analyze their behavior and effectiveness in terms of speed. This chapter describes the inefficiencies of CNM and shows that the determining factor for speed is the number of interconnected communities (NIC), which represents the number of operations performed when merging two communities. We propose an improvement of CNM that takes the NIC into account, together with a new implementation framework that accelerates CNM. We compare our improvements with the original CNM and its variations on large-scale networks from seven real data sets (Mixi, Facebook, Flickr, LiveJournal, Orkut, YouTube, and Delicious) and on five synthetic networks with different structural properties. The experimental results demonstrate that the performance of all algorithms is affected by the structural properties of the network, and that our proposed improvements outperform the earlier algorithms in terms of speed and modularity for most network structures, which demonstrates their applicability to real large-scale networks.
Leon-Suematsu and Yuta contributed equally to this work.
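To make the role of the NIC concrete, the following is a minimal sketch of greedy modularity agglomeration in the spirit of CNM, not the authors' implementation. It assumes an undirected, unweighted graph without self-loops and integer node labels, and it rescans all community pairs at every step instead of using the max-heaps of the original CNM. The function name `greedy_merge` is hypothetical, and counting the NIC of a merge as the number of distinct communities adjacent to either merged community is an assumption made for illustration.

```python
from collections import defaultdict


def greedy_merge(edges):
    """Greedy modularity agglomeration (illustrative sketch only).

    Returns (communities, modularity, nic_log), where nic_log[t] is the
    number of neighbouring communities touched by the t-th merge.
    """
    two_m = 2.0 * len(edges)
    links = defaultdict(dict)  # links[i][j] = edges between communities i and j
    degree = defaultdict(int)
    for u, v in edges:
        links[u][v] = links[u].get(v, 0) + 1
        links[v][u] = links[v].get(u, 0) + 1
        degree[u] += 1
        degree[v] += 1

    a = {i: degree[i] / two_m for i in degree}  # fraction of edge ends per community
    members = {i: {i} for i in degree}
    q = -sum(x * x for x in a.values())  # singletons have no internal edges
    nic_log = []

    while True:
        # Find the pair of connected communities with the largest positive gain.
        best, pair = 0.0, None
        for i in links:
            for j, e in links[i].items():
                if i < j:
                    dq = 2.0 * (e / two_m - a[i] * a[j])
                    if dq > best:
                        best, pair = dq, (i, j)
        if pair is None:
            break

        i, j = pair
        # Assumed NIC: communities adjacent to i or j. Each one needs an update.
        neighbours = (set(links[i]) | set(links[j])) - {i, j}
        nic_log.append(len(neighbours))

        # Merge community i into community j.
        for k in neighbours:
            w = links[i].get(k, 0) + links[j].get(k, 0)
            links[j][k] = w
            links[k][j] = w
            links[k].pop(i, None)
        links[j].pop(i, None)
        del links[i]
        a[j] += a.pop(i)
        members[j] |= members.pop(i)
        q += best

    return list(members.values()), q, nic_log


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge.
    toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    communities, modularity, nic = greedy_merge(toy)
    print(communities, round(modularity, 4), nic)
```

In this sketch the inner loop over `neighbours` is the only per-merge work that grows with the NIC, which is why merges between communities that have many neighbouring communities dominate the running time.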
References
Ahn, Y., Han, S., Kwak, H., Moon, S., and Jeong, H. Analysis of topological characteristics of huge online social networking services. In Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada, pp. 835–844, 2007.
Albert, R. and Barabási, A.-L. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47–97, Jan 2002.
Blondel, V.D., Guillaume, J.-L., Lambiotte, R. and Lefebvre, E. Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment, 10:P10008, 2008.
Brandes, U., Delling, D., Gaertler, M., Goerke, R., Hoefer, M., Nikoloski, Z., and Wagner, D. Maximizing modularity is hard. arXiv:physics/0608255, 2006.
Capocci, A., Servedio, V.D.P., Caldarelli, G., and Colaiori, F. Detecting communities in large networks. arXiv:cond-mat/0402499v2, 2004.
Clauset, A., Newman, M.E.J., and Moore, C. Finding community structure in very large networks. Physical Review E, 70:066111, 2004.
Danon, L., Diaz-Guilera, A., and Arenas, A. Effect of size heterogeneity on community identification in complex networks. arXiv:physics/0601144, 2006.
Danon, L., Duch, J., Diaz-Guilera, A., and Arenas, A. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 09:P09008, 2005.
Derenyi, I., Palla, G., and Vicsek, T. Clique percolation in random networks. Physical Review Letters, 94:160202, 2005.
Duch, J. and Arenas, A. Community detection in complex networks using extremal optimization. Physical Review E, 72:027104, 2005.
Fortunato, S. and Barthelemy, M. Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America, 104:36, 2007.
Girvan, M. and Newman, M.E.J. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99:7821, 2002.
Guimerà, R. and Amaral, L.A.N. Functional cartography of complex metabolic networks. Nature, 433:895, 2005.
Guimerà, R., Sales-Pardo, M., and Amaral, L.A.N. Modularity from fluctuations in random graphs and complex networks. Physical Review E, 70(2):025101, 2004.
Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. Trawling the web for emerging cyber-communities. Computer Networks, 31(11–16):1481–1493, 1999.
Lancichinetti, A., Fortunato, S., and Radicchi, F. Benchmark graphs for testing community detection algorithms. Physical Review E, 78:046110, 2008.
Leicht, E.A. and Newman, M.E.J. Community structure in directed networks. arXiv:0709.4500v1, 2007.
Leon-Suematsu, Y.I. and Yuta, K. A framework for fast community extraction of large-scale networks. In Proceedings of the 17th International Conference on World Wide Web, Beijing, China, pp. 1215–1216, 2008.
Massen, C.P. and Doye, J.P.K. Identifying “communities” within energy landscapes. Physical Review E, 71:046101, 2005.
Mislove, A., Marcon, M., Gummadi, K. P., Druschel, P., and Bhattacharjee, B. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, San Diego, California, USA, pp. 29–42, 2007.
Newman, M.E.J. Analysis of weighted networks. Physical Review E, 70:056131, 2004.
Newman, M.E.J. Fast algorithm for detecting community structure in networks. Physical Review E, 69(6):066133, 2004.
Newman, M.E.J. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74:036104, 2006.
Newman, M.E.J. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103(23):8577–8582, 2006.
Newman, M.E.J. and Girvan, M. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004.
Palla, G., Farkas, I.J., Pollner, P., Derenyi, I., and Vicsek, T. Directed network modules. New Journal of Physics, 9:186, 2007.
Vázquez, A. Growing networks with local rules: Preferential attachment, clustering hierarchy and degree correlations. Physical Review E, 67:056104, 2003.
Viswanath, B., Mislove, A., Cha, M., and Gummadi, K. P. On the evolution of user interaction in Facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks, Barcelona, Spain, pp. 37–42, 2009.
Wakita, K. and Tsurumi, T. Finding community structure in mega-scale social networks. arXiv:cs/0702048, 2007.
Watts, D.J. and Strogatz, S.H. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, June 1998.
Yuta, K., Ono, N., and Fujiwara, Y. A gap in the community-size distribution of a large-scale social networking site. arXiv:physics/0701168v2, 2007.
Acknowledgments
The authors are grateful to Yoshi Fujiwara from ATR for his ever-inspiring discussions and helpful comments on preliminary versions. We would like to thank the anonymous reviewers for their invaluable comments and for letting us know about the competing algorithm. We also thank Alan Mislove from the Max-Planck Institute for providing his data sets. Finally, we would like to thank Mixi, Inc., for providing the data set, in which users were all encrypted. The data set is handled under a Non-Disclosure Agreement. Our work does not evaluate the personality of participants or services in any SNSs. We declare no competing interests.
Copyright information
© 2010 Springer US
About this chapter
Cite this chapter
Leon-Suematsu, Y.I., Yuta, K. (2010). Framework for Fast Identification of Community Structures in Large-Scale Social Networks. In: Memon, N., Xu, J., Hicks, D., Chen, H. (eds) Data Mining for Social Network Data. Annals of Information Systems, vol 12. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6287-4_9
DOI: https://doi.org/10.1007/978-1-4419-6287-4_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6286-7
Online ISBN: 978-1-4419-6287-4
eBook Packages: Business and Economics, Business and Management (R0)