CongraPlus: Towards Efficient Processing of Concurrent Graph Queries on NUMA Machines


Abstract:

Graph analytics has been routinely used to solve problems in a wide range of real-life applications. Efficiently processing concurrent graph analytics queries in a multiuser environment is highly desirable as we enter a world of edge-device-oriented services. Existing research, however, primarily focuses on analyzing a single, large graph dataset and leaves the efficient processing of multiple mid-sized graph analytics queries an intriguing yet challenging open problem. In this work, we investigate the scheduling of concurrent graph analytics queries on NUMA machines. We analyze the performance of several graph analytics algorithms and observe that they have diminishing performance returns as the number of processor cores increases. With concurrent graph analytics, such diminishing returns translate to no or even negative performance gains because of increasing contention on shared hardware resources. We also demonstrate the unpredictability of memory bandwidth usage for numerous graph analytics algorithms, which can lead to sub-optimal performance because it can cause severe memory bandwidth contention. Motivated by the above observations, we propose CongraPlus, a NUMA-aware scheduler that intelligently manages concurrent graph analytics queries for better system throughput and memory bandwidth efficiency. CongraPlus collects the memory bandwidth consumption characteristics of graph analytics queries via offline profiling and eliminates memory bandwidth contention by computing the optimal sequence to launch queries. It also avoids computation resource contention by assigning a certain number of processor cores to the individual queries. We implement CongraPlus in C++ on top of the Ligra graph processing framework and test it with judiciously selected graph processing query combinations. Our results reveal that CongraPlus-based schemes improve query throughput by 30 percent compared to the conventional approach.
It also exhibits a much better quality of ser...
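
The scheduling idea in the abstract (profile each query's memory bandwidth offline, then choose when to launch queries so that neither bandwidth nor cores are oversubscribed) can be illustrated with a small sketch. This is not the CongraPlus algorithm: the abstract does not give its details, and the paper's implementation is in C++ on Ligra. The sketch below is in Python for brevity and swaps in a simple greedy admission heuristic. The `Query` fields, the `schedule` function, and the parameters `bw_cap` and `total_cores` are all hypothetical names introduced for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    name: str
    bandwidth: float  # profiled peak memory bandwidth in GB/s (assumed known offline)
    cores: int        # number of cores assigned to this query
    duration: float   # profiled run time in seconds at that core count


def schedule(queries, bw_cap, total_cores):
    """Greedy sketch: return {query name: launch time}.

    A query is launched only if the summed profiled bandwidth and the
    summed core count of all running queries stay within the machine
    limits; otherwise it waits until a running query finishes.
    """
    for q in queries:
        if q.bandwidth > bw_cap or q.cores > total_cores:
            raise ValueError(f"query {q.name} can never fit on this machine")

    # Consider the most bandwidth-hungry queries first (a heuristic, not an optimum).
    pending = sorted(queries, key=lambda q: q.bandwidth, reverse=True)
    running = []  # min-heap of (finish_time, tie_breaker, query)
    starts = {}
    now, used_bw, used_cores, seq = 0.0, 0.0, 0, 0

    while pending:
        # Launch every pending query that fits in the remaining budget.
        for q in list(pending):
            if used_bw + q.bandwidth <= bw_cap and used_cores + q.cores <= total_cores:
                starts[q.name] = now
                heapq.heappush(running, (now + q.duration, seq, q))
                seq += 1
                used_bw += q.bandwidth
                used_cores += q.cores
                pending.remove(q)
        if pending:
            # Nothing else fits: advance time to the next completion and free its resources.
            now, _, done = heapq.heappop(running)
            used_bw -= done.bandwidth
            used_cores -= done.cores
    return starts
```

Calling `schedule` with a list of profiled queries, a bandwidth cap, and a core count returns a launch time per query. A real scheduler would also have to account for NUMA node placement and for bandwidth that varies over a query's lifetime, both of which this sketch ignores.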
Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 30, Issue: 9, 01 September 2019)
Page(s): 1990 - 2002
Date of Publication: 14 February 2019
