Perspective on Measurement Metrics for Community Detection Algorithms

Yang, Yang; Sun, Yizhou; Pandit, Saurav; Chawla, Nitesh V.; Han, Jiawei

doi:10.1007/978-94-007-6359-3_12

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Community detection, or cluster detection, in networks is often at the core of mining network data. Although the problem is well studied, the scale and complexity of modern social networks make detecting “reasonable” communities hard. Since the first use of the k-means algorithm in the 1960s, many community detection algorithms have been presented. Most were developed with specific goals in mind, and the notion of a meaningful community varies widely from one algorithm to another.

As the number of clustering algorithms grows, so does the number of metrics used to evaluate them. Algorithms are often reduced to optimizing the value of an objective function such as modularity or internal density. Some of these metrics rely on ground truth; others do not. In this chapter we study these algorithms and ask whether such optimization-based measurements are consistent with the real performance of community detection algorithms. Seven representative algorithms are compared under various performance metrics on various real-world social networks.
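To make the two objective functions named above concrete, here is a minimal Python sketch. It is not taken from the chapter: it assumes an undirected, unweighted graph given as an edge list, uses the standard Newman-Girvan form of modularity, and takes internal density to mean the fraction of possible within-community node pairs that are connected (other definitions exist in the literature). The function names and the toy graph are illustrative only.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition.

    Q = sum over communities c of [ e_c / m - (d_c / (2m))^2 ], where m is
    the number of edges, e_c the number of edges with both endpoints in c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # e_c per community
    degree_sum = defaultdict(int)  # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def internal_density(edges, community_of, c):
    """Fraction of node pairs inside community c that are joined by an edge."""
    n_c = sum(1 for label in community_of.values() if label == c)
    if n_c < 2:
        return 0.0
    e_c = sum(1 for u, v in edges
              if community_of[u] == c and community_of[v] == c)
    return e_c / (n_c * (n_c - 1) / 2)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print(round(modularity(edges, labels), 3))   # 0.357, i.e. 5/14
print(internal_density(edges, labels, "A"))  # 1.0, the triangle is complete
```

Neither function needs ground-truth labels; both score a partition using the graph alone, which is why such quantities are attractive as optimization targets.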

The difficulty of evaluating community detection algorithms is mostly due to the unavailability of ground-truth information, so objective functions such as modularity are used as substitutes. Benchmark networks that simulate real-world networks with planted community structure have been introduced to address this unavailability; however, whether the simulation is precise and useful has not been verified. In this chapter we present the performance of community detection algorithms on real-world networks and on their corresponding benchmark networks, a comparison designed to demonstrate the differences between the two.
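As an illustration of what "planted community structure" means, the sketch below generates a small synthetic network whose ground truth is known by construction. The abstract does not say which benchmark generator the chapter uses, so this uses the simple planted-partition model as a stand-in; the function names and the parameters `p_in`, `p_out`, and `seed` are assumptions made for this example.

```python
import random


def planted_partition(num_groups, group_size, p_in, p_out, seed=0):
    """Generate an undirected graph with known (planted) communities.

    Each pair of nodes is linked with probability p_in if both nodes belong
    to the same group and with probability p_out otherwise.
    Returns (edges, truth), where truth maps each node to its group index.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    n = num_groups * group_size
    truth = {node: node // group_size for node in range(n)}
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if truth[u] == truth[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, truth


def intra_fraction(edges, truth):
    """Share of edges whose endpoints lie in the same planted group."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if truth[u] == truth[v])
    return same / len(edges)


# Four groups of 25 nodes: dense inside groups, sparse between them.
edges, truth = planted_partition(4, 25, p_in=0.3, p_out=0.02, seed=42)
print(len(edges), intra_fraction(edges, truth))
```

Because `truth` is available, the output of a detection algorithm on such a network can be compared against it directly. Whether a generator like this resembles a given real-world network closely enough to be informative is exactly the open question the abstract raises.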

Acknowledgements

This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Authors and Affiliations

Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
Yang Yang, Saurav Pandit & Nitesh V. Chawla
Department of Computer Science, University of Illinois, Urbana and Champaign, Urbana, IL, 61801, USA
Yizhou Sun & Jiawei Han

Corresponding author

Correspondence to Yang Yang.

Editor information

Editors and Affiliations

Department of Computer Engineering, TOBB University, Sogutozu Cad No. 43, Sogutozu Ankara, Turkey
Tansel Özyer
Information Technologies Institute, TUBITAK BILGEM, Gebze, Kocaeli, 41470, Turkey
Zeki Erdem
Computer Science, University of Calgary, University Dr. NW 2500, Calgary, T2N 1N4, Canada
Jon Rokne
American University of Sharjah, Universities City, Sharjah, United Arab Emirates
Suheil Khoury

About this chapter

Cite this chapter

Yang, Y., Sun, Y., Pandit, S., Chawla, N.V., Han, J. (2013). Perspective on Measurement Metrics for Community Detection Algorithms. In: Özyer, T., Erdem, Z., Rokne, J., Khoury, S. (eds) Mining Social Networks and Security Informatics. Lecture Notes in Social Networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6359-3_12

DOI: https://doi.org/10.1007/978-94-007-6359-3_12
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6358-6
Online ISBN: 978-94-007-6359-3
eBook Packages: Computer Science, Computer Science (R0)
