Framework for Fast Identification of Community Structures in Large-Scale Social Networks

Data Mining for Social Network Data

Part of the book series: Annals of Information Systems (AOIS, volume 12)

Abstract

One of the most important features of real networks is the presence of community structures: subsets of nodes that are more densely connected to each other than to the rest of the network, and that encode information about the organization and functionality of the nodes. Analyzing social networking sites (SNS), which support the interaction of millions of users, has important scientific and practical implications; however, it requires the development of fast algorithms. We focus on the algorithm developed by Clauset, Newman, and Moore (CNM) and its widely used modifications, and analyze their behavior and effectiveness in terms of speed. This chapter describes the inefficiencies of CNM and shows that the determining factor for speed is the number of interconnected communities (NIC), which represents the number of operations performed when merging two communities. We propose a new improvement of CNM that takes the NIC into account, together with a new implementation framework that accelerates CNM. Our improvements were compared with the original CNM and its variations on large-scale networks from seven real data sets (Mixi, Facebook, Flickr, LiveJournal, Orkut, YouTube, and Delicious) and five synthetic networks with different structural properties. The experimental results demonstrate that the performance of all algorithms is affected by the structural properties of the network, and that our proposed improvements outperform the earlier algorithms in terms of speed and modularity for most network structures, which shows their applicability to real large-scale networks.
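
The link between merge cost and the number of neighbouring communities can be illustrated with a minimal Python sketch. This is not the authors' implementation, nor the heap-based data structure of CNM; it is a naive greedy modularity merge that assumes an undirected, unweighted edge list without self-loops or duplicate edges. The function name `greedy_merge` and the recorded cost (the number of adjacency entries held by the two communities being joined) are assumptions made for illustration, used here only as a rough proxy for the NIC.

```python
def greedy_merge(edges):
    """Naive greedy modularity agglomeration (illustrative only).

    Returns (q, communities, costs), where costs[k] is the number of
    adjacency entries held by the two communities joined in merge k.
    """
    w = 1.0 / (2 * len(edges))   # weight of a single edge end
    e = {}                       # e[i][j]: fraction of edge ends between i and j
    a = {}                       # a[i]: fraction of edge ends attached to i
    members = {}                 # members[i]: nodes currently in community i
    for u, v in edges:
        for x, y in ((u, v), (v, u)):
            row = e.setdefault(x, {})
            row[y] = row.get(y, 0.0) + w
            a[x] = a.get(x, 0.0) + w
            members.setdefault(x, [x])

    # Modularity when every node is its own community (no internal edges).
    q = -sum(ai * ai for ai in a.values())
    costs = []
    while True:
        # Find the connected pair with the largest positive gain
        # dQ = 2 * (e_ij - a_i * a_j).
        best, pair = 0.0, None
        for i, row in e.items():
            for j, eij in row.items():
                dq = 2.0 * (eij - a[i] * a[j])
                if dq > best:
                    best, pair = dq, (i, j)
        if pair is None:         # simplification: stop when no merge improves Q
            break
        i, j = pair
        costs.append(len(e[i]) + len(e[j]))
        # Fold community j into community i, rewriting each neighbour's row.
        row_j = e.pop(j)
        del e[i][j]
        for k, ejk in row_j.items():
            if k == i:
                continue
            e[i][k] = e[i].get(k, 0.0) + ejk
            e[k][i] = e[k].get(i, 0.0) + ejk
            del e[k][j]
        a[i] += a.pop(j)
        members[i].extend(members.pop(j))
        q += best
    return q, sorted(sorted(c) for c in members.values()), costs


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge.
    toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    q, communities, costs = greedy_merge(toy)
    print(q, communities, costs)
```

In this sketch every merge rewrites one adjacency entry per neighbouring community, so merges between communities with many neighbours dominate the running time. That is the kind of cost the abstract attributes to the NIC.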

Leon-Suematsu and Yuta contributed equally to this work.

Notes

  1. http://cs.unm.edu/aaron/blog/archives/2007/02/fastmodularity.htm

  2. http://www.cs.unm.edu/aaron/research/fastmodularity.htm

  3. http://findcommunities.googlepages.com/

References

  1. Ahn, Y., Han, S., Kwak, H., Moon, S., and Jeong, H. Analysis of topological characteristics of huge online social networking services. In Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada, pp. 835–844, 2007.

  2. Albert, R. and Barabási, A.-L. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47–97, Jan 2002.

  3. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 10:P10008, 2008.

  4. Brandes, U., Delling, D., Gaertler, M., Goerke, R., Hoefer, M., Nikoloski, Z., and Wagner, D. Maximizing modularity is hard. arXiv:physics/0608255, 2006.

  5. Capocci, A., Servedio, V.D.P., Caldarelli, G., and Colaiori, F. Detecting communities in large networks. arXiv:cond-mat/0402499v2, 2004.

  6. Clauset, A., Newman, M.E.J., and Moore, C. Finding community structure in very large networks. Physical Review E, 70:066111, 2004.

  7. Danon, L., Diaz-Guilera, A., and Arenas, A. Effect of size heterogeneity on community identification in complex networks. arXiv:physics/0601144, 2006.

  8. Danon, L., Duch, J., Diaz-Guilera, A., and Arenas, A. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 09:P09008, 2005.

  9. Derenyi, I., Palla, G., and Vicsek, T. Clique percolation in random networks. Physical Review Letters, 94:160202, 2005.

  10. Duch, J. and Arenas, A. Community detection in complex networks using extremal optimization. Physical Review E, 72:027104, 2005.

  11. Fortunato, S. and Barthelemy, M. Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America, 104:36, 2007.

  12. Girvan, M. and Newman, M.E.J. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99:7821, 2002.

  13. Guimerà, R. and Amaral, L.A.N. Functional cartography of complex metabolic networks. Nature, 433:895, 2005.

  14. Guimerà, R., Sales-Pardo, M., and Amaral, L.A.N. Modularity from fluctuations in random graphs and complex networks. Physical Review E, 70(2):025101, 2004.

  15. Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. Trawling the web for emerging cyber-communities. Computer Networks, 31(11–16):1481–1493, 1999.

  16. Lancichinetti, A., Fortunato, S., and Radicchi, F. Benchmark graphs for testing community detection algorithms. Physical Review E, 78:046110, 2008.

  17. Leicht, E.A. and Newman, M.E.J. Community structure in directed networks. arXiv:0709.4500v1, 2007.

  18. Leon-Suematsu, Y.I. and Yuta, K. A framework for fast community extraction of large-scale networks. In Proceedings of the 17th International Conference on World Wide Web, Beijing, China, pp. 1215–1216, 2008.

  19. Massen, C.P. and Doye, J.P.K. Identifying “communities” within energy landscapes. Physical Review E, 71:046101, 2005.

  20. Mislove, A., Marcon, M., Gummadi, K. P., Druschel, P., and Bhattacharjee, B. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, San Diego, California, USA, pp. 29–42, 2007.

  21. Newman, M.E.J. Analysis of weighted networks. Physical Review E, 70:056131, 2004.

  22. Newman, M.E.J. Fast algorithm for detecting community structure in networks. Physical Review E, 69(6):066133, 2004.

  23. Newman, M.E.J. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74:036104, 2006.

  24. Newman, M.E.J. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103(23):8577–8582, 2006.

  25. Newman, M.E.J. and Girvan, M. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004.

  26. Palla, G., Farkas, I.J., Pollner, P., Derenyi, I., and Vicsek, T. Directed network modules. New Journal of Physics, 9:186, 2007.

  27. Vázquez, A. Growing networks with local rules: Preferential attachment, clustering hierarchy and degree correlations. Physical Review E, 67:056104, 2003.

  28. Viswanath, B., Mislove, A., Cha, M., and Gummadi, K. P. On the evolution of user interaction in Facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks, Barcelona, Spain, pp. 37–42, 2009.

  29. Wakita, K. and Tsurumi, T. Finding community structure in mega-scale social networks. arXiv:cs/0702048, 2007.

  30. Watts, D.J. and Strogatz, S.H. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, June 1998.

  31. Yuta, K., Ono, N., and Fujiwara, Y. A gap in the community-size distribution of a large-scale social networking site. arXiv:physics/0701168v2, 2007.

Acknowledgments

The authors are grateful to Yoshi Fujiwara from ATR for his ever-inspiring discussions and helpful comments on preliminary versions. We would like to thank the anonymous reviewers for their invaluable comments and for letting us know about the competing algorithm. We also thank Alan Mislove from the Max-Planck Institute for providing his data sets. Finally, we would like to thank Mixi, Inc., for providing its data set, in which all user identities were encrypted. The data set is handled under a non-disclosure agreement. Our work does not evaluate the personality of participants or the services of any SNS. We declare no competing interests.

Author information

Corresponding author

Correspondence to Yutaka I. Leon-Suematsu.

Copyright information

© 2010 Springer US

About this chapter

Cite this chapter

Leon-Suematsu, Y.I., Yuta, K. (2010). Framework for Fast Identification of Community Structures in Large-Scale Social Networks. In: Memon, N., Xu, J., Hicks, D., Chen, H. (eds) Data Mining for Social Network Data. Annals of Information Systems, vol 12. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6287-4_9
