Auto-tuning an OpenACC Accelerated Version of Nek5000

  • Conference paper
Solving Software Challenges for Exascale (EASC 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8759)

Abstract

Accelerators and, in particular, Graphics Processing Units (GPUs) have emerged as promising computing technologies that may be suitable for future Exascale systems. However, the complexity of their architectures and the impenetrable structure of some large applications make hand-tuning algorithms challenging and unproductive. By contrast, auto-tuning technology has emerged as a solution to these problems, since it can address the inherent complexity of the latest and future computer architectures. With auto-tuning, an application may be optimised for a target platform by making automated optimal choices. To exploit this technology on modern GPUs, we have created an auto-tuned version of Nek5000 based on OpenACC directives, which has been demonstrated to obtain improved results over a hand-tuned, optimised version of the same computation kernels. This paper focuses on the role of auto-tuning in enabling Nek5000 to utilise a massively parallel GPU-accelerated system, using OpenACC directives to adapt the Nek5000 code for Exascale computation.
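
The idea of optimising an application by making automated choices can be illustrated with a minimal sketch. This is not the authors' tuner or the CRESTA DSL: the parameter names (`vector_length`, `num_gangs`), the search space and the synthetic cost function are all assumptions made for illustration. In a real setting the cost would be a measured kernel runtime on the target platform.

```python
from itertools import product


def autotune(space, cost):
    """Exhaustively evaluate every configuration and keep the cheapest one.

    space: dict mapping a parameter name to the list of values to try.
    cost:  callable taking a configuration dict and returning a number
           (lower is better), e.g. a measured runtime.
    """
    names = list(space)
    best_cfg, best_cost = None, float("inf")
    for values in product(*(space[name] for name in names)):
        cfg = dict(zip(names, values))
        c = cost(cfg)
        if c < best_cost:
            best_cfg, best_cost = cfg, c
    return best_cfg, best_cost


def toy_cost(cfg):
    # Hypothetical stand-in for a timed kernel run: a synthetic model whose
    # minimum lies at vector_length=128, num_gangs=256 (assumed values).
    return (abs(cfg["vector_length"] - 128) / 128
            + abs(cfg["num_gangs"] - 256) / 256)


# Hypothetical search space, loosely modelled on OpenACC launch parameters.
space = {
    "vector_length": [32, 64, 128, 256],
    "num_gangs": [64, 128, 256, 512],
}

best_cfg, best_cost = autotune(space, toy_cost)
print(best_cfg, best_cost)
```

Exhaustive search is used here only because the space is tiny; larger spaces generally call for a pruned or heuristic search.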

Author information

Corresponding author

Correspondence to Luis Cebamanos.

A Example of DSL Script

[Figure f: example DSL script; not reproduced in this extract]

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cebamanos, L., Henty, D., Richardson, H., Hart, A. (2015). Auto-tuning an OpenACC Accelerated Version of Nek5000. In: Markidis, S., Laure, E. (eds) Solving Software Challenges for Exascale. EASC 2014. Lecture Notes in Computer Science, vol 8759. Springer, Cham. https://doi.org/10.1007/978-3-319-15976-8_5

  • DOI: https://doi.org/10.1007/978-3-319-15976-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15975-1

  • Online ISBN: 978-3-319-15976-8

  • eBook Packages: Computer Science, Computer Science (R0)
