Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures

Goli, Mehdi; González–Vélez, Horacio

doi:10.1007/s10766-016-0419-4

Published: 30 March 2016

Volume 45, pages 203–224, (2017)

International Journal of Parallel Programming

Mehdi Goli¹ &
Horacio González–Vélez²

Abstract

Widely adumbrated as patterns of parallel computation and communication, algorithmic skeletons introduce a viable solution for efficiently programming modern heterogeneous multi-core architectures equipped not only with traditional multi-core CPUs, but also with one or more programmable Graphics Processing Units (GPUs). By systematically applying algorithmic skeletons to address complex programming tasks, it is arguably possible to separate the coordination from the computation in a parallel program, and therefore subdivide a complex program into building blocks (modules, skids, or components) that can be independently created and then used in different systems to drive multiple functionalities. By exploiting such systematic division, it is feasible to automate coordination by addressing extra-functional and non-functional features such as application performance, portability, and resource utilisation from the component level in heterogeneous multi-core architectures. In this paper, we introduce a novel approach to exploit the inherent features of skeleton-based applications in order to automatically coordinate them over heterogeneous (CPU/GPU) multi-core architectures and improve their performance. Our systematic evaluation demonstrates up to one order of magnitude speed-up on heterogeneous multi-core architectures.
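The separation of coordination from computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the PEI or FastFlow APIs; the names `farm`, `pipeline`, `square`, and `increment` are hypothetical and chosen only for illustration. The two skeletons hold the coordination logic, and the worker functions hold the sequential computation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def farm(worker: Callable[[A], B], n_workers: int = 4) -> Callable[[Iterable[A]], List[B]]:
    """Hypothetical farm skeleton: coordination only.

    Applies `worker` to every input item using a pool of threads.
    """
    def run(items: Iterable[A]) -> List[B]:
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(worker, items))
    return run


def pipeline(*stages: Callable) -> Callable:
    """Hypothetical pipeline skeleton: coordination only.

    Feeds the output of each stage into the next one.
    """
    def run(items):
        for stage in stages:
            items = stage(items)
        return items
    return run


# Computation: plain sequential functions with no knowledge of parallelism.
def square(x: int) -> int:
    return x * x


def increment(x: int) -> int:
    return x + 1


# Composition: the same workers can be reused under different coordination.
app = pipeline(farm(square), farm(increment, n_workers=2))

if __name__ == "__main__":
    print(app(range(5)))  # [1, 2, 5, 10, 17]
```

Because the workers know nothing about how they are scheduled, the coordination layer could in principle be swapped (for example, changing the number of workers or the target device) without touching the computation, which is the property the abstract relies on for automated coordination.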

Notes

PEI is an open-source framework available at https://github.com/mehdi-goli/MC-FastFlow-PEI.

Author information

Authors and Affiliations

Robert Gordon University, Aberdeen, UK
Mehdi Goli
National College of Ireland, Dublin, Ireland
Horacio González–Vélez

Corresponding author

Correspondence to Horacio González–Vélez.

Additional information

Research supported by the European Commission FP7 Project “ParaPhrase: Parallel Patterns for Adaptive Heterogeneous Multicore Systems” under Contract No. 288570.

About this article

Cite this article

Goli, M., González–Vélez, H. Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures. Int J Parallel Prog 45, 203–224 (2017). https://doi.org/10.1007/s10766-016-0419-4

Received: 05 September 2015
Accepted: 23 March 2016
Published: 30 March 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10766-016-0419-4
