Abstract
Graph algorithms are hard to parallelize because they exhibit varying degrees of parallelism and perform irregular memory accesses. Graph coloring is a well-studied problem that assigns colors to the vertices of a graph such that no two adjacent vertices share the same color. Many applications depend on it and require a coloring with few colors in near-linear time. In this work, we propose a simple and fast parallel graph coloring algorithm that is well suited to shared-memory architectures. Our algorithm employs Hardware Transactional Memory (HTM) to detect coloring inconsistencies between adjacent vertices and exploits Read-Copy-Update (RCU) to enable high performance and ensure correctness.
We evaluate our algorithm on an Intel Haswell server using large-scale synthetic and real-world graphs, chosen to vary in terms of density and structure. With 14 threads, we achieved a geometric-mean speedup of 4.35 and a maximum speedup of 11.44.
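To make the problem statement concrete, the following is a minimal sequential sketch of first-fit greedy coloring together with a check for coloring inconsistencies between adjacent vertices. It is not the parallel HTM/RCU algorithm proposed in the paper; the function names `greedy_coloring` and `find_conflicts`, the adjacency-list representation, and the small example graph are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: vertex id -> list of neighbor ids (undirected).
Graph = Dict[int, List[int]]


def greedy_coloring(graph: Graph) -> Dict[int, int]:
    """Sequential first-fit heuristic: give each vertex the smallest
    color not already used by one of its colored neighbors."""
    colors: Dict[int, int] = {}
    for v in graph:
        forbidden = {colors[u] for u in graph[v] if u in colors}
        c = 0
        while c in forbidden:
            c += 1
        colors[v] = c
    return colors


def find_conflicts(graph: Graph, colors: Dict[int, int]) -> List[Tuple[int, int]]:
    """Return every edge (u, v) with u < v whose endpoints share a color."""
    return [
        (u, v)
        for u in graph
        for v in graph[u]
        if u < v and colors[u] == colors[v]
    ]


# Illustrative graph: a 4-cycle 0-1-2-3 with the extra chord 1-3.
graph: Graph = {0: [1, 3], 1: [0, 2, 3], 2: [1, 3], 3: [0, 1, 2]}
colors = greedy_coloring(graph)
conflicts = find_conflicts(graph, colors)
```

In a parallel setting, several threads may color adjacent vertices at the same time, so a check like `find_conflicts` can find same-colored neighbors that must be recolored. The abstract states that the proposed algorithm uses HTM to detect such inconsistencies.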
Acknowledgments
Christina Giannoula is funded by a PhD fellowship from the General Secretariat for Research and Technology (GSRT) and the Hellenic Foundation for Research and Innovation (HFRI). The authors thank their anonymous reviewers and their colleagues Nikela Papadopoulou, Konstantinos Nikas, Vasileios Karakostas and Dimitrios Siakavaras for their insightful comments and valuable feedback.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Giannoula, C., Goumas, G., Koziris, N. (2018). Combining HTM with RCU to Speed Up Graph Coloring on Multicore Platforms. In: Yokota, R., Weiland, M., Keyes, D., Trinitis, C. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science, vol 10876. Springer, Cham. https://doi.org/10.1007/978-3-319-92040-5_18
DOI: https://doi.org/10.1007/978-3-319-92040-5_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92039-9
Online ISBN: 978-3-319-92040-5
eBook Packages: Computer Science (R0)