VPC: Pruning connected components using vector-based path compression for Graph500

  • Regular Paper
CCF Transactions on High Performance Computing

A Correction to this article was published on 15 November 2021

This article has been updated

Abstract

Graphs are an effective way to represent and organize data, and graph analysis is a promising killer application for AI systems. However, recently emerging extremely large graphs (consisting of trillions of vertices and edges) exceed the capacity of small- and medium-scale clusters and therefore require supercomputers for efficient processing. Graph500 is the de facto standard for benchmarking the graph processing performance of supercomputers, and connected components (CC) is an important basic algorithm for Graph500's BFS and SSSP tests. Current CC algorithms, however, are inefficient on supercomputers, and computing CC quickly is expensive and challenging. In this paper, we propose VPC, an efficient method that prunes connected components using vector-based path compression. It includes the following innovations: (i) the data structure of the traversal algorithm is customized as a two-dimensional adjacency vector; (ii) vector-based path compression is proposed for the union-find algorithm; (iii) a parallel version of VPC is proposed and customized for Tianhe. Experimental results show that the two-dimensional adjacency vector performs better than other data structures, and that vector-based path compression can be used to implement the union-find algorithm. At scale 26, the performance of our algorithm is 1.38\(\times\), 1.69\(\times\) and 2.57\(\times\) that of other algorithms. When the union-find algorithm is used to compute connected components, its performance is 5.14\(\times\) and 5.01\(\times\) that of BFS and DFS, respectively.
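To make the comparison in the abstract concrete, the following is a minimal sketch of the two generic approaches it mentions: a BFS traversal over an adjacency structure stored as a vector of vectors, and a union-find with path compression stored in a flat parent vector. It is a textbook illustration, not the authors' VPC implementation. The function names (`find`, `cc_union_find`, `cc_bfs`), the edge-list input format, and the choice to label each component by its smallest vertex id are all assumptions made for this example.

```python
from collections import deque


def find(parent, x):
    """Return the root of x and compress the path from x to that root."""
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every vertex on the path directly at the root.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def cc_union_find(num_vertices, edges):
    """Label components with union-find over a flat parent vector."""
    parent = list(range(num_vertices))
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            # Assumption for this sketch: link the larger root under the
            # smaller one, so each label is the smallest id in its component.
            if ru < rv:
                parent[rv] = ru
            else:
                parent[ru] = rv
    return [find(parent, v) for v in range(num_vertices)]


def cc_bfs(num_vertices, edges):
    """Label components with BFS over a vector-of-vectors adjacency list."""
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    label = [-1] * num_vertices
    for start in range(num_vertices):
        if label[start] != -1:
            continue  # already reached from an earlier start vertex
        label[start] = start
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if label[v] == -1:
                    label[v] = start
                    queue.append(v)
    return label


if __name__ == "__main__":
    example_edges = [(0, 1), (1, 2), (3, 4)]
    print(cc_union_find(6, example_edges))  # [0, 0, 0, 3, 3, 5]
    print(cc_bfs(6, example_edges))         # [0, 0, 0, 3, 3, 5]
```

Both functions return the same labeling for the same input, which makes them easy to cross-check. Union-find touches only the parent vector while scanning the edge list, whereas BFS must first build the adjacency structure. This sketch does not attempt to reproduce the speedups reported in the abstract.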

Acknowledgements

This work was supported by the National Numerical Wind Tunnel Project (NNW2019ZT6-B21), the National Key Research and Development Program of China (2018YFB0204301), the Hunan Natural Science Foundation of China (2020JJ4669), and the Foundation of Parallel and Distributed Processing Laboratory (6142110190206).

Author information

Corresponding author

Correspondence to Hao Bai.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Bai, H., Gan, X., Xu, T. et al. VPC: Pruning connected components using vector-based path compression for Graph500. CCF Trans. HPC 3, 271–285 (2021). https://doi.org/10.1007/s42514-021-00070-z

  • DOI: https://doi.org/10.1007/s42514-021-00070-z
