Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs

Peterson, Brad; Humphrey, Alan; Schmidt, John; Berzins, Martin

doi:10.1145/3152041.3152082

Conference · Nov 1, 2017 · Proceedings of the 3rd International IEEE Workshop on Extreme Scale Programming Models and Middleware

DOI: https://doi.org/10.1145/3152041.3152082 · OSTI ID: 1582428

Author affiliation (all authors): Univ. of Utah, Salt Lake City, UT (United States)

Large-scale parallel applications with complex global data dependencies beyond those of reductions pose significant scalability challenges in an asynchronous runtime system. Internodal challenges include identifying the all-to-all communication of data dependencies among the nodes. Intranodal challenges include gathering these data dependencies into usable data objects while avoiding data duplication. This paper addresses these challenges within the context of a large-scale, industrial coal boiler simulation using the Uintah asynchronous many-task runtime system on GPU architectures. We show a significant reduction in time spent analyzing data dependencies through refinements in our dependency search algorithm. Multiple task graphs are used to eliminate subsequent analysis when task graphs change in predictable and repeatable ways. A combined redesign of the data store and task scheduler reduces data dependency duplication, ensuring that problems fit within host and GPU memory. Furthermore, these modifications did not require any changes to application code or sweeping changes to the Uintah runtime system. We report results running on the DOE Titan system on 119K CPU cores and 7.5K GPUs simultaneously. Our solutions can be generalized to other task dependency problems with global dependencies among thousands of nodes, which must be processed efficiently at large scale.
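The idea of keeping multiple task graphs so that dependency analysis is not repeated can be illustrated with a small sketch. This is not Uintah code: the class `TaskGraphCache`, the functions `graph_kind` and `analyze_dependencies`, the task names, and the assumption that an expensive all-to-all task recurs every fixed number of timesteps are all hypothetical, chosen only to show the caching pattern under those assumptions.

```python
from typing import Callable, Dict, List


class TaskGraphCache:
    """Builds each distinct task graph once and reuses it afterwards."""

    def __init__(self, build: Callable[[str], List[str]]) -> None:
        self._build = build
        self._graphs: Dict[str, List[str]] = {}
        self.builds = 0  # counts how often the expensive analysis ran

    def get(self, kind: str) -> List[str]:
        if kind not in self._graphs:
            self._graphs[kind] = self._build(kind)
            self.builds += 1
        return self._graphs[kind]


def graph_kind(timestep: int, interval: int) -> str:
    # Hypothetical pattern: a globally dependent task runs every `interval` steps.
    return "with_global_task" if timestep % interval == 0 else "standard"


def analyze_dependencies(kind: str) -> List[str]:
    # Stand-in for an expensive dependency analysis; task names are invented.
    tasks = ["local_update", "halo_exchange"]
    if kind == "with_global_task":
        tasks.append("all_to_all_gather")
    return tasks


cache = TaskGraphCache(analyze_dependencies)
schedule = [cache.get(graph_kind(t, interval=5)) for t in range(20)]
print(cache.builds)
```

In this sketch the analysis runs once per distinct graph rather than once per graph change, because the alternation between the two graphs is known in advance.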

Research Organization: Univ. of Utah, Salt Lake City, UT (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC)

DOE Contract Number: NA0002375; AC05-00OR22725

OSTI ID: 1582428

Journal Information: Proceedings of the 3rd International IEEE Workshop on Extreme Scale Programming Models and Middleware, Conference: 3rd International IEEE Workshop on Extreme Scale Programming Models and Middleware (ESPM2'17), Denver, CO (United States), 12 Nov 2017

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
Data dependencies
Asynchronous Many-Task
Programming Models
Runtime Systems
Scalability
GPU
Uintah
Coal Boiler
Radiative Heat Transfer
