
A preliminary evaluation of OpenACC implementations

The Journal of Supercomputing

Abstract

Over the last few years, the availability of hardware accelerators such as GPUs has increased rapidly. However, the entry cost of GPU programming is high: porting and tuning existing code demands considerable effort. Some research groups and vendors have tried to ease the situation by defining APIs and languages that simplify these tasks. In the wake of the success of OpenMP, industry and academia are working toward a new standard of compiler directives to reduce the GPU programming effort. Vendor support and similarities with the upcoming OpenMP 4.0 standard lead us to believe that OpenACC is a good alternative for developers who want to port existing codes to accelerators. In this paper, we evaluate three OpenACC implementations, two commercial (PGI and CAPS) and our own research implementation, accULL, to assess the current status and future directions of the standard.

Acknowledgements

This work has been partially supported by the EU (FEDER), the Spanish MEC (Plan Nacional de I+D+I, contracts TIN2008-06570-C04-03 and TIN2011-24598), HPC-EUROPA2 (project number 228398) and the Canary Islands Government, ACIISI (contract PI2008/285).

Author information

Corresponding author

Correspondence to Francisco de Sande.

About this article

Cite this article

Reyes, R., López, I., Fumero, J.J. et al. A preliminary evaluation of OpenACC implementations. J Supercomput 65, 1063–1075 (2013). https://doi.org/10.1007/s11227-012-0853-z

  • DOI: https://doi.org/10.1007/s11227-012-0853-z
