Paradigmatic shifts for exascale supercomputing

Abstract

As the next generation of supercomputers reaches the exascale, the dominant design parameter governing performance will shift from hardware to software. Intelligent use of memory access, vectorization, and intranode threading will become critical to the performance of scientific applications and numerical calculations on exascale supercomputers. Although challenges remain in effectively programming the heterogeneous devices likely to be used in future supercomputers, new languages and tools provide a pathway for application developers to tackle this new frontier. These languages include open programming standards such as OpenCL and OpenACC, as well as widely adopted languages such as CUDA; high-quality libraries such as CUDPP and Thrust are also important. This article surveys a purposely diverse set of proof-of-concept applications developed at Los Alamos National Laboratory. We find that the capability of accelerator computing hardware and languages has moved beyond regular-grid finite-difference calculations and molecular dynamics codes. More advanced applications requiring dynamic memory allocation, such as cell-based adaptive mesh refinement, can now be addressed, and with more effort even unstructured mesh codes can be moved to the GPU.
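To make the accelerator-offload model the abstract refers to concrete, the following is a minimal CUDA sketch, not taken from the article; the kernel name, grid size, and coefficient are illustrative assumptions. It offloads one class of calculation the abstract describes as already well within reach of accelerators: an explicit finite-difference update on a regular grid.

#include <cstdio>
#include <vector>
#include <algorithm>
#include <cuda_runtime.h>

// One explicit diffusion step on a regular 1D grid:
// u_new[i] = u[i] + c * (u[i-1] - 2*u[i] + u[i+1])
__global__ void diffuse_step(const double *u, double *u_new, double c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        u_new[i] = u[i] + c * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
}

int main()
{
    const int n = 1 << 20;            // number of grid cells (illustrative)
    const double c = 0.1;             // diffusion coefficient * dt / dx^2 (illustrative)
    std::vector<double> h_u(n, 0.0);
    h_u[n / 2] = 1.0;                 // single hot cell in the middle of the grid

    double *d_u, *d_u_new;
    cudaMalloc(&d_u, n * sizeof(double));
    cudaMalloc(&d_u_new, n * sizeof(double));
    cudaMemcpy(d_u, h_u.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_u_new, h_u.data(), n * sizeof(double), cudaMemcpyHostToDevice); // keeps boundary cells fixed

    const int block = 256;
    const int grid = (n + block - 1) / block;
    for (int step = 0; step < 100; ++step) {
        diffuse_step<<<grid, block>>>(d_u, d_u_new, c, n);
        std::swap(d_u, d_u_new);      // ping-pong the buffers between steps
    }

    cudaMemcpy(h_u.data(), d_u, n * sizeof(double), cudaMemcpyDeviceToHost);
    std::printf("u at the hot cell after 100 steps: %g\n", h_u[n / 2]);

    cudaFree(d_u);
    cudaFree(d_u_new);
    return 0;
}

Even in a toy sketch like this, the abstract's central point is visible: performance is governed less by the arithmetic than by how data is placed and moved, with a single host-to-device transfer up front, coalesced memory access inside the kernel, and all intermediate steps kept resident on the device.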

Acknowledgements

The authors would like to thank Ben Bergen and Marcus Daniels for the use of Darwin, the LANL CCS GPU cluster.

The authors are also grateful to the LANL CCS/X-Division Exascale working group led by Tim Kelley, and to Scott Runnels for organizing the LANL X-Division Summer Workshop exascale group, at which much of the foundational work for this article was performed. These groups encouraged the work on applications across different computational domains.

This work was supported by Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Los Alamos National Security, LLC, for the National Nuclear Security Administration of the US Department of Energy under contract DE-AC52-06NA25396.

Author information

Corresponding author

Correspondence to Neal E. Davis.

About this article

Cite this article

Davis, N.E., Robey, R.W., Ferenbaugh, C.R. et al. Paradigmatic shifts for exascale supercomputing. J Supercomput 62, 1023–1044 (2012). https://doi.org/10.1007/s11227-012-0789-3
