
Accelerating Scientific Applications on Heterogeneous Systems with HybridOMP

  • Conference paper
High Performance Computing for Computational Science – VECPAR 2018 (VECPAR 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11333)


Abstract

High Performance Computing relies on accelerators (such as GPGPUs) to achieve fast execution of scientific applications. Traditionally, these accelerators have been programmed in specialized languages such as CUDA or OpenCL. In recent years, OpenMP has emerged as a promising alternative for supporting accelerators, offering advantages such as a single code base for the host and different accelerator types, as well as a straightforward way to extend existing code with accelerator support. Using this support efficiently requires solving several challenges related to performance, work partitioning, and concurrent execution on multiple device types. In this paper, we discuss these challenges and introduce a library, HybridOMP, that addresses several of them, thus enabling the effective use of OpenMP for accelerators. We apply HybridOMP to a scientific application, PlasCom2, that had not previously been able to use accelerators. Experiments on three architectures show that HybridOMP achieves performance gains of up to 10x compared to CPU-only execution. Concurrent execution on the host and GPU resulted in additional gains of up to 10% compared to running on the GPU alone.



Acknowledgments

This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374.

Author information

Correspondence to Matthias Diener.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Diener, M., Bodony, D.J., Kale, L. (2019). Accelerating Scientific Applications on Heterogeneous Systems with HybridOMP. In: Senger, H., et al. (eds.) High Performance Computing for Computational Science – VECPAR 2018. Lecture Notes in Computer Science, vol. 11333. Springer, Cham. https://doi.org/10.1007/978-3-030-15996-2_13


  • DOI: https://doi.org/10.1007/978-3-030-15996-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15995-5

  • Online ISBN: 978-3-030-15996-2

