Abstract
High Performance Computing relies on accelerators (such as GPGPUs) to achieve fast execution of scientific applications. Traditionally, these accelerators have been programmed with specialized languages such as CUDA or OpenCL. In recent years, OpenMP has emerged as a promising alternative for supporting accelerators, offering advantages such as maintaining a single code base for the host and different accelerator types and providing a simple way to extend accelerator support to existing code. Using this support efficiently requires solving several challenges related to performance, work partitioning, and concurrent execution on multiple device types. In this paper, we discuss these challenges and introduce a library, HybridOMP, that addresses several of them, thus enabling the effective use of OpenMP for accelerators. We apply HybridOMP to a scientific application, PlasCom2, that has not previously been able to use accelerators. Experiments on three architectures show that HybridOMP results in performance gains of up to 10x compared to CPU-only execution. Concurrent execution on the host and GPU resulted in additional gains of up to 10% compared to running on the GPU only.
Acknowledgments
This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Diener, M., Bodony, D.J., Kale, L. (2019). Accelerating Scientific Applications on Heterogeneous Systems with HybridOMP. In: Senger, H., et al. High Performance Computing for Computational Science – VECPAR 2018. VECPAR 2018. Lecture Notes in Computer Science(), vol 11333. Springer, Cham. https://doi.org/10.1007/978-3-030-15996-2_13
DOI: https://doi.org/10.1007/978-3-030-15996-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15995-5
Online ISBN: 978-3-030-15996-2
eBook Packages: Computer Science, Computer Science (R0)