Application of the ParalleX execution model to stencil-based problems

Heller, T.; Kaiser, H.; Iglberger, K.

doi:10.1007/s00450-012-0217-1

Special Issue Paper
Published: 23 May 2012

Volume 28, pages 253–261, (2013)

Computer Science - Research and Development

T. Heller¹,
H. Kaiser² &
K. Iglberger¹

387 Accesses
29 Citations

Abstract

With the exascale era and its millions of execution units approaching, the question of how to exploit this level of parallelism efficiently is becoming urgent. State-of-the-art parallelization techniques such as OpenMP and MPI are not guaranteed to solve the expected problems of starvation, growing latencies, overheads, and contention. New parallelization paradigms, on the other hand, promise to hide latencies efficiently and to contain starvation and contention.

In this paper we analyze the performance of one novel parallelization strategy for shared and distributed memory machines. We focus on shared memory architectures and compare the performance of the ParalleX execution model against the quasi-standard OpenMP for a standard stencil-based problem. We compare in detail the OpenMP implementations of two Jacobi solver applications (one based on a regular grid and one based on an irregular grid structure) with the corresponding implementations using HPX (High Performance ParalleX), the first feature-complete, open-source implementation of ParalleX, and analyze the results of both on a multi-socket NUMA node.

Acknowledgements

We thank Matthew Anderson for helpful discussions and for preparing the irregular grids we used. We thank Georg Hager for optimizing the OpenMP usage and for suggesting the applied performance model. We acknowledge the support from the Center for Computation and Technology (CCT) at Louisiana State University (LSU) and from NSF grants (1029161, 1117470) to LSU.

Author information

Authors and Affiliations

Friedrich-Alexander University Erlangen-Nuremberg, 91058, Erlangen, Germany
T. Heller & K. Iglberger
Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
H. Kaiser

Corresponding author

Correspondence to T. Heller.

About this article

Cite this article

Heller, T., Kaiser, H. & Iglberger, K. Application of the ParalleX execution model to stencil-based problems. Comput Sci Res Dev 28, 253–261 (2013). https://doi.org/10.1007/s00450-012-0217-1

Published: 23 May 2012
Issue Date: May 2013
DOI: https://doi.org/10.1007/s00450-012-0217-1
