DOI: 10.1145/3605573.3605637
research-article
Open Access

Fast Parallel Index Construction for Efficient K-truss-based Local Community Detection in Large Graphs

Published: 13 September 2023

ABSTRACT

Finding cohesive subgraphs is a crucial graph analysis kernel widely used for social and biological networks (graphs). There exist various approaches for discovering insightful substructures in a network, such as finding cliques, community discovery, and truss decomposition. Finding cliques is a computationally intractable problem, making it difficult to identify cohesive subgraphs in large graphs. One possible solution is k-truss decomposition, which is a relaxed form of finding cliques that can be solved in polynomial time. Further, unlike global community detection, which focuses on breaking down the entire graph into disjoint communities, a local or goal-oriented community search aims at finding the community of an entity of interest. In this work, we identify a k-truss-induced community discovery technique that can detect local communities in polynomial time. However, most previous studies have explored k-truss-induced local community formation in a serial setting, making them unsuitable for large graphs. In this paper, we design a parallel k-truss-induced local community construction method using multi-core parallelism. To the best of our knowledge, this is the first attempt to parallelize this algorithmic approach with extensive performance analysis. Our experiments demonstrate a significant performance improvement, with speedups from 19x to 55x for graphs with hundreds of millions to billions of edges, using NERSC Perlmutter compute nodes.
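To make the k-truss idea concrete, the following is a minimal serial sketch of truss decomposition by edge peeling. It uses the standard definition, under which a k-truss is a maximal subgraph where every edge lies in at least k - 2 triangles of that subgraph. It is not the parallel index construction method described in the abstract; the function name `truss_decomposition` and the edge-list input format are assumptions made for illustration only.

```python
from collections import defaultdict


def truss_decomposition(edges):
    """Return {edge: trussness} for an undirected simple graph.

    Illustrative serial peeling only (hypothetical helper, not the
    paper's parallel algorithm). Node labels must be orderable.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    def key(u, v):
        # Canonical (small, large) form so each edge has one dict key.
        return (u, v) if u < v else (v, u)

    # support[e] = number of triangles that contain edge e
    support = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                support[(u, v)] = len(adj[u] & adj[v])

    trussness = {}
    k = 2
    while support:
        # Edges with support <= k - 2 cannot belong to a (k + 1)-truss.
        queue = [e for e, s in support.items() if s <= k - 2]
        if not queue:
            k += 1
            continue
        while queue:
            e = queue.pop()
            if e not in support:  # already peeled via a duplicate entry
                continue
            u, v = e
            trussness[e] = k
            del support[e]
            # Removing (u, v) destroys one triangle per common neighbor w.
            for w in adj[u] & adj[v]:
                for f in (key(u, w), key(v, w)):
                    support[f] -= 1
                    if support[f] <= k - 2:
                        queue.append(f)
            adj[u].discard(v)
            adj[v].discard(u)
    return trussness
```

For example, on a 4-clique joined by a single bridge edge to a triangle, this sketch assigns trussness 4 to the clique edges, 3 to the triangle edges, and 2 to the bridge. Peeling runs in polynomial time, which illustrates why truss decomposition is tractable where exact clique finding is not.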

Published in

ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
August 2023
858 pages
ISBN: 9798400708435
DOI: 10.1145/3605573

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 91 of 313 submissions, 29%