ABSTRACT
Finding cohesive subgraphs is a crucial graph analysis kernel, widely used on social and biological networks. Approaches for discovering insightful substructures in a network include clique finding, community discovery, and truss decomposition. Clique finding is computationally intractable, which makes it difficult to identify cohesive subgraphs in large graphs. One possible solution is k-truss decomposition, a relaxed form of clique finding that can be solved in polynomial time. Further, unlike global community detection, which partitions the entire graph into disjoint communities, local (goal-oriented) community search aims to find the community of an entity of interest. In this work, we identify a k-truss-induced community discovery technique that can detect local communities in polynomial time. However, most previous studies have explored k-truss-induced local community formation in a serial setting, making them unsuitable for large graphs. In this paper, we design a parallel k-truss-induced local community construction method using multi-core parallelism. To the best of our knowledge, this is the first attempt to parallelize this algorithmic approach with extensive performance analysis. Our experiments on NERSC Perlmutter compute nodes demonstrate speedups from 19x to 55x for graphs with hundreds of millions to billions of edges.
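To make the polynomial-time claim concrete, the following is a minimal serial sketch in Python. It is not the parallel index construction described in the abstract. It assumes the standard definition of a k-truss (the maximal subgraph in which every edge lies in at least k - 2 triangles) and uses simple edge peeling. The function names `k_truss` and `local_community` are hypothetical, vertices are assumed to be comparable values such as integers, and `local_community` uses plain connectivity inside the k-truss as a simplification of the triangle-connectivity condition found in the community-search literature.

```python
from collections import deque


def k_truss(edges, k):
    """Return the edge set of the k-truss (assumed definition: every
    remaining edge lies in at least k - 2 triangles)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Support of an edge = number of triangles that contain it.
    support = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                support[(u, v)] = len(adj[u] & adj[v])

    # Peel edges with too little support; each removal destroys the
    # triangles through that edge, so neighbouring supports drop by one.
    queue = deque(e for e, s in support.items() if s < k - 2)
    queued = set(queue)
    while queue:
        u, v = queue.popleft()
        for w in adj[u] & adj[v]:
            for a, b in ((u, w), (v, w)):
                e = (a, b) if a < b else (b, a)
                support[e] -= 1
                if support[e] < k - 2 and e not in queued:
                    queued.add(e)
                    queue.append(e)
        adj[u].discard(v)
        adj[v].discard(u)
        del support[(u, v)]
    return set(support)


def local_community(edges, k, query):
    """Vertices reachable from `query` using only k-truss edges.
    Simplification: plain connectivity, not triangle connectivity."""
    adj = {}
    for u, v in k_truss(edges, k):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if query not in adj:
        return set()
    seen = {query}
    stack = [query]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return seen


# Two 4-cliques joined by a single bridge edge (3, 4).
clique_a = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
clique_b = [(4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
graph = clique_a + clique_b + [(3, 4)]
community_of_0 = local_community(graph, 4, 0)
```

In this toy graph the bridge edge lies in no triangle, so it is peeled for any k of 3 or more, and the query vertex 0 ends up in a community containing only its own clique. A global method would label every vertex; the local search only reports the community around the query.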
Index Terms
- Fast Parallel Index Construction for Efficient K-truss-based Local Community Detection in Large Graphs