DOI:10.1145/3284103.3284111
research-article

A Degree-based Distributed Label Propagation Algorithm for Community Detection in Networks

Published: 06 November 2018

Abstract

Community detection is an important way to understand the functions and characteristics of complex systems. The Label Propagation Algorithm (LPA) is a community detection algorithm whose time complexity is close to linear. However, because of its randomness and instability, it is difficult to obtain good results with it. This paper proposes a degree-based label propagation algorithm and parallelizes it on the GraphX component of the Spark platform (D-disLPA). Using empirical network data from WeChat, the paper adopts modularity as the evaluation index to compare D-disLPA with traditional label propagation algorithms. The experimental results show that D-disLPA can effectively solve the non-convergence problem of traditional algorithms and improve the stability and accuracy of community detection. The distributed algorithm is also able to satisfy the requirements of large-scale community detection.
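The abstract does not say how node degree enters the propagation rule, so the following is only a minimal single-machine sketch of the general idea, not the paper's D-disLPA and not a GraphX implementation. It assumes (hypothetically) that each neighbor's vote is weighted by that neighbor's degree, that low-degree nodes update first, and that remaining ties go to the smallest label, which removes the random choices of plain LPA. The function names `build_adjacency`, `degree_lpa`, and `modularity` are illustrative. Modularity is computed with the standard formula Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree_lpa(adj, max_rounds=20):
    """Deterministic label propagation with degree-weighted votes.

    Assumptions of this sketch (not taken from the paper):
    - a neighbor's vote for its label is weighted by the neighbor's degree;
    - nodes are updated in ascending order of degree, using fresh labels;
    - ties are broken by the smallest label instead of at random;
    - max_rounds is a safety cap in case the labels never settle.
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    labels = {v: v for v in adj}  # every node starts in its own community
    order = sorted(adj, key=lambda v: (degree[v], v))
    for _ in range(max_rounds):
        changed = False
        for v in order:
            if not adj[v]:
                continue  # isolated node keeps its own label
            score = defaultdict(int)
            for n in adj[v]:
                score[labels[n]] += degree[n]
            # Highest score wins; smallest label breaks ties.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels


def modularity(adj, labels):
    """Newman-Girvan modularity of a partition of an undirected graph."""
    m = sum(len(ns) for ns in adj.values()) / 2  # number of edges
    if m == 0:
        return 0.0
    internal = defaultdict(float)  # edges inside each community
    deg_sum = defaultdict(float)   # total degree of each community
    for v, ns in adj.items():
        deg_sum[labels[v]] += len(ns)
        for n in ns:
            if labels[n] == labels[v]:
                internal[labels[v]] += 0.5  # each internal edge is seen twice
    return sum(internal[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)


# Toy graph: two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = build_adjacency(edges)
labels = degree_lpa(adj)
q = modularity(adj, labels)
print(sorted(labels.items()), round(q, 4))
```

On this toy graph the sketch separates the two triangles (nodes 0 to 2 and nodes 3 to 5), giving a modularity of 5/14, roughly 0.357. A distributed version would express the per-node update as message passing between vertices, which is the part the paper delegates to GraphX.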


Cited By

  • (2024) Scalable Spatio-temporal Top-k Interaction Queries on Dynamic Communities. ACM Transactions on Spatial Algorithms and Systems 10:1 (1-25). DOI:10.1145/3648374. Online publication date: 16-Feb-2024
  • (2021) A community detection algorithm based on Quasi-Laplacian centrality peaks clustering. Applied Intelligence. DOI:10.1007/s10489-021-02278-6. Online publication date: 19-Mar-2021
  • (2020) Influence propagation based community detection in complex networks. Machine Learning with Applications (100019). DOI:10.1016/j.mlwa.2020.100019. Online publication date: Dec-2020

    Published In

    Safety and Resilience'18: Proceedings of the 4th ACM SIGSPATIAL International Workshop on Safety and Resilience
    November 2018
    129 pages
    ISBN:9781450360449
    DOI:10.1145/3284103
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Community detection
    2. D-disLPA
    3. Label Propagation Algorithm
    4. Pregel

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SIGSPATIAL '18

    Acceptance Rates

    Safety and Resilience'18 Paper Acceptance Rate 22 of 38 submissions, 58%;
    Overall Acceptance Rate 22 of 38 submissions, 58%
