Scaling the GCR Solver Using a High-Level Stencil Framework on Multi- and Many-Core Architectures

Ciznicki, Milosz; Kulczewski, Michal; Kopta, Piotr; Kurowski, Krzysztof

doi:10.1007/978-3-319-32152-3_55

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9574)

Abstract

The recent advent of novel multi- and many-core architectures forces application programmers to deal with hardware-specific implementation details and to be familiar with software optimization techniques in order to benefit from new high-performance computing machines. Extra care must be taken with communication-intensive algorithms, which may become a bottleneck in the forthcoming era of exascale computing. This paper presents a high-level stencil framework, implemented for the EULAG model, that efficiently utilizes heterogeneous clusters. Only efficient use of both CPUs and GPUs, combined with a flexible data decomposition method, can deliver the maximum performance needed to scale a communication-intensive elliptic solver with a preconditioner.

Acknowledgements

This work is supported by the Polish National Center of Science under Grant No. UMO-2011/03/B/ST6/03500. This research was supported in part by PL-Grid Infrastructure. This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d25.

Author information

Authors and Affiliations

Poznań Supercomputing and Networking Center, Poznań, Poland
Milosz Ciznicki, Michal Kulczewski, Piotr Kopta & Krzysztof Kurowski

Corresponding author

Correspondence to Milosz Ciznicki.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
Department of Computer Science, University of Southern California, Marina Del Rey, California, USA
Ewa Deelman
Electrical Engineering & Comput. Science, University of Tennessee, Knoxville, Tennessee, USA
Jack Dongarra
Czestochowa University of Technology, Institute of Computer & Information Sci., Czestochowa, Poland
Konrad Karczewski
Department of Computer Science, AGH University of Science and Technology, Krakow, Poland
Jacek Kitowski
Systèmes d’informations, Big Data et Rec, AGH University of Science and Technology, Krakow, Poland
Kazimierz Wiatr

About this paper

Cite this paper

Ciznicki, M., Kulczewski, M., Kopta, P., Kurowski, K. (2016). Scaling the GCR Solver Using a High-Level Stencil Framework on Multi- and Many-Core Architectures. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. Lecture Notes in Computer Science, vol 9574. Springer, Cham. https://doi.org/10.1007/978-3-319-32152-3_55

DOI: https://doi.org/10.1007/978-3-319-32152-3_55
Published: 02 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32151-6
Online ISBN: 978-3-319-32152-3
eBook Packages: Computer Science, Computer Science (R0)
