Abstract
This paper advances the state of the art in programming models for exploiting task-level parallelism on heterogeneous many-core systems, presenting a number of extensions to the OpenMP language inspired by the StarSs programming model. The proposed extensions allow the programmer to easily write portable code for a number of different platforms, relieving him/her from developing the platform-specific code needed to off-load tasks to the accelerators and to synchronize tasks. Results obtained from the StarSs instantiations for SMPs, the Cell, and GPUs show reasonable parallel performance. However, the real impact of our approach is the productivity gain it yields for the programmer.
About this article
Cite this article
Ayguadé, E., Badia, R.M., Bellens, P. et al. Extending OpenMP to Survive the Heterogeneous Multi-Core Era. Int J Parallel Prog 38, 440–459 (2010). https://doi.org/10.1007/s10766-010-0135-4
DOI: https://doi.org/10.1007/s10766-010-0135-4
Profiles
- Francisco Igual
- Xavier Martorell