Abstract
Graph analytics is important in many domains, including social networks, computer networks, and computational biology. This paper describes the challenges of programming the underlying graph algorithms on distributed systems with CPU, GPU, and multi-GPU machines, and how to address them. It emphasizes how language abstractions and good compilation can ease the programming of graph analytics on such platforms without sacrificing implementation efficiency.
Acknowledgements
This survey paper was inspired by the ongoing collaborative book project co-authored by Dr. Unnikrishnan Cheramangalath of IIT Palakkad, Dr. Rupesh Nasre of IIT Madras, and the author of this paper. The author wishes to acknowledge their assistance in writing this paper.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Srikant, Y.N. (2020). Distributed Graph Analytics. In: Hung, D., D'Souza, M. (eds) Distributed Computing and Internet Technology. ICDCIT 2020. Lecture Notes in Computer Science, vol 11969. Springer, Cham. https://doi.org/10.1007/978-3-030-36987-3_1
DOI: https://doi.org/10.1007/978-3-030-36987-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36986-6
Online ISBN: 978-3-030-36987-3
eBook Packages: Computer Science (R0)