Perspective on Measurement Metrics for Community Detection Algorithms

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Community detection, or cluster detection, in networks is often at the core of mining network data. Although the problem is well studied, the scale and complexity of modern social networks often make detecting “reasonable” communities hard. Since the first use of the k-means algorithm in the 1960s, many community detection algorithms have been presented. Most were developed with specific goals in mind, and the notion of what makes a community meaningful varies widely from one algorithm to another.

As the number of clustering algorithms grows, so does the number of metrics for evaluating them. Algorithms are often reduced to optimizing the value of an objective function such as modularity or internal density. Some of these metrics rely on ground truth; others do not. In this chapter we study these algorithms and examine whether these optimization-based measurements are consistent with the real performance of community detection algorithms. Seven representative algorithms are compared under various performance metrics on various real-world social networks.
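
To make the two objective functions named above concrete, here is a minimal sketch of how modularity and internal density can be computed for a given partition of an undirected, unweighted graph. It uses the standard Newman-Girvan definition of modularity and defines internal density as the fraction of possible within-community edges that are present. The function names, the edge-list representation, and the toy graph are illustrative assumptions, not taken from the chapter.

```python
from collections import defaultdict

# Assumed toy graph (not from the chapter): two triangles joined by one edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
MEMBERSHIP = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # node -> community label


def modularity(edges, membership):
    """Newman-Girvan modularity: sum over communities c of
    (intra_c / m) - (degree_sum_c / (2 m))^2, with m the number of edges."""
    m = len(edges)
    intra = defaultdict(int)       # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


def internal_density(edges, membership):
    """Per-community ratio of existing internal edges to possible internal edges."""
    sizes = defaultdict(int)
    for community in membership.values():
        sizes[community] += 1
    intra = defaultdict(int)
    for u, v in edges:
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    return {
        c: (intra[c] / (n * (n - 1) / 2) if n > 1 else 0.0)
        for c, n in sizes.items()
    }


if __name__ == "__main__":
    print(modularity(EDGES, MEMBERSHIP))        # approximately 0.357 (exactly 5/14)
    print(internal_density(EDGES, MEMBERSHIP))  # both triangles are fully connected
```

Neither function needs ground-truth labels: both score a partition using only the graph itself, which is why such quantities are commonly used as optimization targets.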

The difficulty of measuring community detection algorithms stems mostly from the unavailability of ground-truth information, so objective functions such as modularity are used as substitutes. Benchmark networks that simulate real-world networks with planted community structure have been introduced to address this unavailability; however, whether the simulation is precise and useful has not been verified. In this chapter we present the performance of community detection algorithms on real-world networks and their corresponding benchmark networks, a comparison designed to expose the differences between the two.
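
As a rough illustration of what a benchmark network with planted community structure looks like, the sketch below generates a simple planted-partition graph: nodes are assigned to equal-sized groups, and each pair of nodes is linked with probability `p_in` if the two share a group and `p_out` otherwise. This abstract does not say which benchmark generator the chapter uses, so the model, the function name `planted_partition`, and its parameters are assumptions chosen for brevity.

```python
import random
from itertools import combinations


def planted_partition(num_groups, group_size, p_in, p_out, seed=0):
    """Return (edges, membership) for a planted-partition random graph.

    membership maps each node to its planted (ground-truth) group, so a
    detected partition can later be compared against it.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    membership = {
        node: node // group_size for node in range(num_groups * group_size)
    }
    edges = []
    for u, v in combinations(membership, 2):
        same_group = membership[u] == membership[v]
        if rng.random() < (p_in if same_group else p_out):
            edges.append((u, v))
    return edges, membership


if __name__ == "__main__":
    edges, truth = planted_partition(num_groups=4, group_size=8, p_in=0.6, p_out=0.05)
    inside = sum(truth[u] == truth[v] for u, v in edges)
    print(f"{len(edges)} edges, {inside} of them inside planted groups")
```

Because the generator returns the planted labels along with the edges, ground-truth-based metrics can be computed on such graphs. Whether graphs like this resemble real networks is the question the chapter raises.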

Acknowledgements

This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation here on.

Corresponding author

Correspondence to Yang Yang.

Copyright information

© 2013 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Yang, Y., Sun, Y., Pandit, S., Chawla, N.V., Han, J. (2013). Perspective on Measurement Metrics for Community Detection Algorithms. In: Özyer, T., Erdem, Z., Rokne, J., Khoury, S. (eds) Mining Social Networks and Security Informatics. Lecture Notes in Social Networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6359-3_12

  • DOI: https://doi.org/10.1007/978-94-007-6359-3_12

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-007-6358-6

  • Online ISBN: 978-94-007-6359-3

  • eBook Packages: Computer Science, Computer Science (R0)