OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine

Cramer, Tim; Römmer, Manoel; Kosmynin, Boris; Focht, Erich; Müller, Matthias S.

doi:10.1007/978-3-030-43229-4_21

Conference paper
First Online: 19 March 2020

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12043)

Abstract

Driven by the trend toward heterogeneity in modern supercomputers, OpenMP has supported heterogeneous systems since 2013. Having a single programming model for all kinds of accelerator-based systems reduces the burden of porting code to different device types. Acceptance of this heterogeneous paradigm requires OpenMP compilers and runtime environments that support different target device architectures. The LLVM/Clang infrastructure is designed so that its offloading features can be extended to new target platforms. However, this presupposes a compatible compiler backend for the target architecture. To overcome this limitation, we present a source-to-source code transformation technique that outlines the OpenMP code regions for the target device. By combining this technique with a corresponding communication layer, we enable OpenMP target offloading to the NEC SX-Aurora TSUBASA vector engine, which represents the new generation of vector computing.
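To illustrate what outlining a target region means, the following is a minimal sketch in C and not the output of the tool described in the paper. It assumes a simple vector update as the offloaded kernel. The function names `scale_add` and `scale_add_target_region_0` are hypothetical, and the communication layer that would transfer the mapped arrays and launch the outlined function on the device is not modelled here.

```c
#include <stddef.h>

/* Original form: the target region is written inline in host code.
 * A compiler without offloading support ignores the target pragma
 * and the loop simply runs on the host. */
void scale_add(double a, const double *x, double *y, size_t n)
{
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Outlined form (hypothetical name): the body of the target region
 * is moved into a standalone function that takes the mapped
 * variables as parameters. A source-to-source tool could emit such
 * a function into a separate file and build it with the native
 * compiler of the target device. */
void scale_add_target_region_0(double a, const double *x, double *y, size_t n)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result when run on the host, which is the property an outlining transformation has to preserve. In an actual offloading setup, the host side would replace the inline region with data transfers and a call that runs the outlined function on the device.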

Notes

1. https://github.com/veos-sxarr-NEC/veoffload
2. https://github.com/RWTH-HPC
3. https://rwth-hpc.github.io/sx-aurora-offloading
4. https://github.com/clang-omp/OffloadingDesign
5. Clang allows x86 to be defined as the target device for testing purposes. In that case the target regions are executed on the host, but through the corresponding plugin in libomptarget.

Author information

Authors and Affiliations

IT Center, RWTH Aachen University, Aachen, Germany
Tim Cramer, Manoel Römmer, Boris Kosmynin & Matthias S. Müller
NEC Corporation, Stuttgart, Germany
Erich Focht

Corresponding author

Correspondence to Tim Cramer.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
University of Southern California, Marina del Rey, CA, USA
Ewa Deelman
University of Tennessee, Knoxville, TN, USA
Jack Dongarra
Czestochowa University of Technology, Czestochowa, Poland
Konrad Karczewski

About this paper

Cite this paper

Cramer, T., Römmer, M., Kosmynin, B., Focht, E., Müller, M.S. (2020). OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12043. Springer, Cham. https://doi.org/10.1007/978-3-030-43229-4_21

DOI: https://doi.org/10.1007/978-3-030-43229-4_21
Published: 19 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43228-7
Online ISBN: 978-3-030-43229-4
eBook Packages: Computer Science, Computer Science (R0)