Abstract
Distance and centrality computations are important building blocks for modern graph databases as well as for dedicated graph analytics systems. Two commonly used centrality metrics are the compute-intensive closeness and betweenness centralities, which require numerous expensive shortest-distance calculations. We propose batched algorithm execution, which runs multiple distance and centrality computations at the same time and lets them share common graph and data accesses. Batched execution amortizes the high cost of random memory accesses and opens up new vectorization potential on modern CPUs and compute accelerators. We show how batched algorithm execution can be leveraged to significantly improve the performance of distance, closeness, and betweenness centrality calculations on unweighted and weighted graphs. Our evaluation demonstrates that batched execution can reduce the runtime of these common metrics by more than an order of magnitude.
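To make the idea of shared graph accesses concrete, the following is a minimal sketch of a batched breadth-first search on an unweighted graph, in the spirit of multi-source traversal. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `batched_bfs_distance_sums`, the adjacency-list input, and the use of Python integers as bitsets are all hypothetical choices made for readability.

```python
def batched_bfs_distance_sums(adj, sources):
    """Run one BFS per source concurrently and return, for each source,
    the sum of hop distances to every vertex it can reach.

    adj: list of neighbor lists (hypothetical input format).
    sources: list of vertex ids forming the batch.
    Bit i of a mask stands for the BFS started at sources[i].
    """
    n = len(adj)
    seen = [0] * n       # seen[v]: BFSs that have already reached v
    frontier = [0] * n   # frontier[v]: BFSs that reached v in the last level
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        frontier[s] |= 1 << i

    dist_sum = [0] * len(sources)
    level = 0
    while any(frontier):
        level += 1
        nxt = [0] * n
        for v in range(n):
            if frontier[v]:
                # One pass over v's neighbor list serves every BFS in the mask.
                for w in adj[v]:
                    nxt[w] |= frontier[v]
        for w in range(n):
            new = nxt[w] & ~seen[w]  # BFSs that reach w for the first time
            nxt[w] = new
            seen[w] |= new
            i = 0
            while new:               # credit the current level to each such BFS
                if new & 1:
                    dist_sum[i] += level
                new >>= 1
                i += 1
        frontier = nxt
    return dist_sum


# Path graph 0 - 1 - 2 - 3, with BFSs started at vertices 0 and 1.
path = [[1], [0, 2], [1, 3], [2]]
print(batched_bfs_distance_sums(path, [0, 1]))  # [6, 4]
```

Each neighbor list is read once per level regardless of how many searches in the batch are active at that vertex, which is the amortization the abstract refers to. The per-source distance sums are the quantity from which a closeness value can then be derived.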
Notes
This paper is an extended version of the previously published [18].
References
Akiba T, Iwata Y, Yoshida Y (2013) Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, New York, pp 349–360
Bader DA, Kintali S, Madduri K, Mihail M (2007) Approximating betweenness centrality. In: International Workshop on Algorithms and Models for the Web-Graph. Springer, Berlin Heidelberg, pp 124–137
Bellman R (1958) On a routing problem. Q Appl Math 87–90. doi:10.1090/qam/102435
Brandes U (2001) A faster algorithm for betweenness centrality. J Math Sociol 25:163–177
Brin S, Page L (1998) The Anatomy of a Large-Scale Hypertextual Web Search Engine. In: Seventh International World-Wide Web Conference (WWW 1998), pp 3825–3833
Eppstein D, Wang J (2001) Fast approximation of centrality. In: Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, Philadelphia, pp 228–229
Fredman ML, Tarjan RE (1987) Fibonacci heaps and their uses in improved network optimization algorithms. J ACM 34(3):596–615. doi:10.1145/28869.28874
Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Networks 1(3):215–239
Hong S, Depner S, Manhardt T, Van Der Lugt J, Verstraaten M, Chafi H (2015) PGX.D: a fast distributed graph processing engine. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, Austin, p 58
Iosup A, Hegeman T, Ngai WL, Heldens S, Prat A, Manhardt T, Chafi H, Capota M, Sundaram N, Anderson M et al (2016) LDBC Graphalytics: A benchmark for large-scale graph analysis on parallel and distributed platforms. Proc VLDB Endow 9(12):1317–1328
Kaufmann M, Then M, Kemper A, Neumann T (2017) Parallel array-based single- and multi-source breadth first searches on large dense graphs. In: Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017, Venice, 21–24 March 2017. doi:10.5441/002/edbt.2017.02
Klein PN (2005) Multiple-source shortest paths in planar graphs. In: SODA ’05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, vol 5. ACM, Vancouver, pp 146–155
Kunegis J (2013) Konect: The Koblenz network collection. In: Proceedings of the 22nd international conference on World Wide Web companion. International World Wide Web Conferences Steering Committee, Rio de Janeiro, pp 1343–1350
Madduri K, Ediger D, Jiang K, Bader DA, Chavarria-Miranda D (2009) A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets. In: Parallel & Distributed Processing IEEE International Symposium. IPDPS 2009. IEEE, Rome, pp 1–8
Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: A system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, Indianapolis, pp 135–146
Ni C, Sugimoto C, Jiang J (2011) Degree, closeness, and betweenness: Application of group centrality measurements to explore macro-disciplinary evolution diachronically. In: Proceedings of ISSI 2011: The 13th Conference of the International Society for Scientometrics and Informetrics. Durban, pp 1–13
Olsen PW, Labouseur AG, Hwang JH (2014) Efficient top-k closeness centrality search. In: 2014 IEEE 30th International Conference on Data Engineering, pp 196–207
Then M, Günnemann S, Kemper A, Neumann T (2017) Efficient batched distance and centrality computation in unweighted and weighted graphs. In: Datenbanksysteme für Business, Technologie und Web (BTW 2017), Stuttgart, 6–10 March 2017. 17. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS). GI, Stuttgart, pp 247–266
Then M, Kaufmann M, Chirigati F, Hoang-Vu TA, Pham K, Kemper A, Neumann T, Vo HT (2014) The more the merrier: efficient multi-source graph traversal. Proc VLDB Endow 8(4):449–460. doi:10.14778/2735496.2735507
Then M, Kersten T, Günnemann S, Kemper A, Neumann T (2017) Automatic algorithm transformation for efficient multi-snapshot analytics on temporal graphs. Proc VLDB Endow 10(8):877–888
Thorup M (2004) Integer priority queues with decrease key in constant time and the single source shortest paths problem. J Comput Syst Sci 69(3):330–353. doi:10.1016/j.jcss.2004.04.003
Yen JY (1970) An algorithm for finding shortest routes from all source nodes to a given destination in general networks. Q Appl Math 526–530. doi:10.1090/qam/253822
Acknowledgements
This research was supported by the German Research Foundation (DFG), Emmy Noether grant GU 1409/2-1, and by the Technical University of Munich - Institute for Advanced Study, funded by the German Excellence Initiative and the European Union Seventh Framework Programme under grant agreement no 291763, co-funded by the European Union. Manuel Then is a recipient of the Oracle External Research Fellowship. Part of this work was conducted during an internship at Oracle Labs.
Cite this article
Then, M., Günnemann, S., Kemper, A. et al. Efficient Batched Distance, Closeness and Betweenness Centrality Computation in Unweighted and Weighted Graphs. Datenbank Spektrum 17, 169–182 (2017). https://doi.org/10.1007/s13222-017-0261-x
DOI: https://doi.org/10.1007/s13222-017-0261-x