Abstract
Parallel breadth-first search (BFS) is a representative algorithm in Graph 500, the well-known benchmark for evaluating supercomputers on data-intensive applications. However, the specific storage model of Graph 500 poses a severe challenge to efficient communication when running parallel BFS on large-scale graphs. In this paper, we propose PruX, an effective method that optimizes the communication of parallel BFS in two ways. First, we adopt a scalable structure to record the access information of the vertices on each machine. Second, by checking these records, we prune unnecessary inter-machine communication for previously accessed vertices. Evaluation results show that our method performs at least six times better than the original implementation of parallel BFS.
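The pruning idea can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the PruX implementation: the abstract does not specify the record structure or the partitioning, so the code below assumes a 1D partition in which vertex `v` is owned by machine `v % num_machines`, and it uses a plain Python `set` per machine (the hypothetical `seen` record) in place of the paper's scalable structure. The function name `simulated_bfs` and the message counter are likewise illustrative.

```python
from collections import defaultdict


def simulated_bfs(adj, num_machines, root, prune):
    """Level-synchronous BFS over a 1D-partitioned graph, simulated in one process.

    adj maps each vertex to its neighbor list. Returns (parent, messages),
    where messages counts the simulated inter-machine sends.
    """
    def owner(v):
        # Assumed partitioning: vertex v lives on machine v % num_machines.
        return v % num_machines

    parent = {root: root}
    # seen[m] is the per-machine access record (hypothetical structure):
    # vertices machine m has already forwarded or knows to be visited.
    seen = [set() for _ in range(num_machines)]
    seen[owner(root)].add(root)
    frontier = [root]
    messages = 0

    while frontier:
        inbox = defaultdict(list)
        for u in frontier:
            m = owner(u)
            for v in adj[u]:
                if prune and v in seen[m]:
                    continue  # pruned: machine m already handled v once
                seen[m].add(v)
                if owner(v) != m:
                    messages += 1  # (v, u) must cross the network
                inbox[owner(v)].append((v, u))

        next_frontier = []
        for m in range(num_machines):
            for v, u in inbox[m]:
                if v not in parent:  # first arrival decides the parent
                    parent[v] = u
                    seen[m].add(v)
                    next_frontier.append(v)
        frontier = next_frontier

    return parent, messages


# Example: a 4-clique plus one isolated vertex, split across two machines.
graph = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2], 4: []}
tree_plain, sent_plain = simulated_bfs(graph, 2, 0, prune=False)
tree_pruned, sent_pruned = simulated_bfs(graph, 2, 0, prune=True)
print(sent_plain, sent_pruned)
```

In this sketch, pruning never changes the resulting BFS tree, because a message is dropped only when the same machine has already forwarded that vertex; only the redundant sends disappear.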
Notes
1. Both PruX and direction optimization improve parallel BFS by modifying the execution mode of the algorithm, so we choose direction optimization as the baseline for comparison.
2. We implement direction optimization only at the algorithm level and do not optimize its storage or computation, which causes it to fail when SCALE is too large.
3. Because the graph still contains many isolated vertices, direction optimization processes these vertices in the bottom-up BFS phase, which incurs a large computation cost and degrades performance on large-scale graphs.
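The cost described in note 3 can be seen in a minimal sketch of a purely bottom-up BFS. It is an illustration under assumptions, not the implementation evaluated in the paper: the function names `bottom_up_step` and `bottom_up_bfs` and the scan counter are hypothetical, and the graph is a toy example. Each bottom-up level scans every unvisited vertex, so isolated vertices, which can never be reached, are rescanned at every level.

```python
def bottom_up_step(adj, frontier, parent):
    """One bottom-up level: every unvisited vertex looks for a parent in the frontier."""
    next_frontier = set()
    scanned = 0
    for v in adj:
        if v in parent:
            continue  # already visited
        scanned += 1  # isolated vertices are counted here at every level
        for u in adj[v]:
            if u in frontier:
                parent[v] = u
                next_frontier.add(v)
                break
    return next_frontier, scanned


def bottom_up_bfs(adj, root):
    """Run BFS using only bottom-up steps; return the tree and scans per level."""
    parent = {root: root}
    frontier = {root}
    scans_per_level = []
    while frontier:
        frontier, scanned = bottom_up_step(adj, frontier, parent)
        scans_per_level.append(scanned)
    return parent, scans_per_level


# Example: a path 0-1-2 plus three isolated vertices (3, 4, 5).
toy_graph = {0: [1], 1: [0, 2], 2: [1], 3: [], 4: [], 5: []}
toy_parent, toy_scans = bottom_up_bfs(toy_graph, 0)
print(toy_scans)  # → [5, 4, 3]
```

In this toy run, 9 of the 12 vertex scans are spent on the three isolated vertices, none of which is ever visited.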
Acknowledgment
This work is sponsored in part by the National Basic Research Program of China (973) under Grant No. 2014CB340303 and by the National Natural Science Foundation of China (NSFC) under Grant No. 61772541.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Jia, M., Zhang, Y., Li, D., Mei, S. (2018). PruX: Communication Pruning of Parallel BFS in the Graph 500 Benchmark. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11334. Springer, Cham. https://doi.org/10.1007/978-3-030-05051-1_8
DOI: https://doi.org/10.1007/978-3-030-05051-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05050-4
Online ISBN: 978-3-030-05051-1
eBook Packages: Computer Science, Computer Science (R0)