Vecpar – A Framework for Portability and Parallelization

Mania, Georgiana; Styles, Nicholas; Kuhn, Michael; Salzburger, Andreas; Yeo, Beomki; Ludwig, Thomas

doi:10.1007/978-3-031-35995-8_18

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14073)

Included in the following conference series:

International Conference on Computational Science

Abstract

Complex particle reconstruction software used by High Energy Physics experiments already pushes the limits of computing resources with demanding requirements for speed and memory throughput, and future experiments pose an even greater challenge. Although many supercomputers have already reached petascale capacities using many-core architectures and accelerators, numerous scientific applications still need to be adapted to make use of these new resources. To ensure a smooth transition to a platform-agnostic code base, we developed a prototype of a portability and parallelization framework named vecpar. In this paper, we introduce the technical concepts and the main features, and we demonstrate the framework’s potential by comparing the runtimes of the single-source vecpar implementation (compiled for different architectures) with native serial and parallel implementations. The comparison reveals a significant speedup over the former and a competitive speedup versus the latter. Further optimizations and extended portability options are currently being investigated and are the focus of future work.

This work was supported by DASHH under grant number HIDSS-0002.

Notes

1. A track is a charged particle trajectory through a detector.
2. https://github.com/wr-hamburg/vecpar
3. As an exception motivated by performance optimization purposes, the vecpar API provides a limited number of mutable versions as well.
4. A parallel_mmap operator which allows mutable data structures is also defined.
5. https://github.com/wr-hamburg/BabelStream/tree/vecpar
6. A set of parameters describing the helical trajectory followed by a charged particle moving within a magnetic field.
7. TARGET is a vecpar macro which adds extra qualifiers at compile time.
8. While the initial goal was to contribute to the increase of the performance portability of open-source track reconstruction software, other scientific use cases are welcome in the future.
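
The notes on mutable operator variants and on a qualifier macro can be illustrated with a minimal sketch in standard C++. It is not the vecpar API: every name below (SKETCH_TARGET, scale_and_shift, map_immutable, map_mutable, fold) is hypothetical, and the assumption that such a macro expands to `__host__ __device__` under a CUDA compiler is ours, not a statement about what TARGET actually does.

```cpp
#include <algorithm>
#include <cassert>   // assert, handy when checking the sketch
#include <numeric>
#include <vector>

// Hypothetical qualifier macro (an assumption, not vecpar's TARGET):
// under a CUDA compiler it marks a function as callable from host and device,
// with any other compiler it expands to nothing.
#if defined(__CUDACC__)
#define SKETCH_TARGET __host__ __device__
#else
#define SKETCH_TARGET
#endif

// Element-wise function that both map variants apply.
SKETCH_TARGET inline double scale_and_shift(double x) { return 2.0 * x + 1.0; }

// Immutable map: the input is left untouched and a new vector is returned.
template <typename F>
std::vector<double> map_immutable(const std::vector<double>& in, F f) {
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), f);
    return out;
}

// Mutable map: the input is overwritten in place, which avoids the extra
// allocation and copy at the cost of losing the original values.
template <typename F>
void map_mutable(std::vector<double>& data, F f) {
    std::transform(data.begin(), data.end(), data.begin(), f);
}

// Serial reduction that combines all elements with a binary function.
template <typename F>
double fold(const std::vector<double>& in, double init, F f) {
    return std::accumulate(in.begin(), in.end(), init, f);
}
```

Calling `map_immutable(v, scale_and_shift)` returns a fresh vector and leaves `v` unchanged, whereas `map_mutable(v, scale_and_shift)` rewrites `v` itself; this sketch runs serially on the host and only shows the difference between the two calling styles.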

Acknowledgment

We acknowledge the support of DASHH (Data Science in Hamburg - HELMHOLTZ Graduate School for the Structure of Matter) under Grant-No. HIDSS-0002. The National Analysis Facility (NAF) at Deutsches Elektronen-Synchrotron (DESY), the University of Hamburg (UHH) and Deutsches Klimarechenzentrum (DKRZ) provided the hardware resources for the experiments.

Author information

Authors and Affiliations

Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607, Hamburg, Germany
Georgiana Mania & Nicholas Styles
University of Hamburg, Hamburg, Germany
Georgiana Mania & Thomas Ludwig
Otto von Guericke University Magdeburg, Magdeburg, Germany
Michael Kuhn
CERN, 1211, Geneva, Switzerland
Andreas Salzburger
Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Beomki Yeo
Department of Physics, University of California, Berkeley, CA, 94720, USA
Beomki Yeo
Deutsches Klimarechenzentrum, Bundesstraße 45a, 20146, Hamburg, Germany
Thomas Ludwig

Corresponding author

Correspondence to Georgiana Mania.

Editor information

Editors and Affiliations

Czech Technical University in Prague, Prague, Czech Republic
Jiří Mikyška
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee at Knoxville, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M.A. Sloot

Cite this paper

Mania, G., Styles, N., Kuhn, M., Salzburger, A., Yeo, B., Ludwig, T. (2023). Vecpar – A Framework for Portability and Parallelization. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14073. Springer, Cham. https://doi.org/10.1007/978-3-031-35995-8_18

DOI: https://doi.org/10.1007/978-3-031-35995-8_18
Published: 26 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35994-1
Online ISBN: 978-3-031-35995-8
eBook Packages: Computer Science, Computer Science (R0)