Application of the ParalleX execution model to stencil-based problems

  • Special Issue Paper
Computer Science - Research and Development

Abstract

With the exascale era and its millions of execution units approaching, the question of how to exploit this level of parallelism efficiently is becoming urgent. State-of-the-art parallelization techniques such as OpenMP and MPI are not guaranteed to solve the expected problems of starvation, growing latencies, overheads, and contention. New parallelization paradigms, on the other hand, promise to hide latencies efficiently and to contain starvation and contention.

In this paper we analyze the performance of one novel parallelization strategy for shared and distributed memory machines. We focus on shared memory architectures and compare the performance of the ParalleX execution model against the quasi-standard OpenMP for a standard stencil-based problem. We compare in detail the OpenMP implementations of two Jacobi solver applications (one based on a regular grid and one based on an irregular grid structure) with the corresponding implementations using HPX (High Performance ParalleX), the first feature-complete, open-source implementation of ParalleX, and analyze the results of both on a multi-socket NUMA node.
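For readers unfamiliar with the benchmark, the following is a minimal sketch of a Jacobi sweep on a regular 2D grid with a five-point stencil, parallelized with an OpenMP loop directive. It is not the code from the paper: the function name `jacobi`, the row-major storage, and the fixed boundary values are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the paper): run `iterations` Jacobi sweeps
// on an n x n grid stored row-major in `u`. Boundary values stay fixed.
std::vector<double> jacobi(std::vector<double> u, std::size_t n,
                           std::size_t iterations)
{
    // Start from a copy so the boundary values are present in both buffers.
    std::vector<double> next(u);
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(n) - 1;

    for (std::size_t it = 0; it < iterations; ++it) {
        // Every interior point reads only the previous iterate `u`,
        // so the rows can be updated independently of each other.
        #pragma omp parallel for
        for (std::ptrdiff_t row = 1; row < last; ++row) {
            const std::size_t i = static_cast<std::size_t>(row);
            for (std::size_t j = 1; j + 1 < n; ++j) {
                next[i * n + j] = 0.25 * (u[(i - 1) * n + j] + u[(i + 1) * n + j]
                                        + u[i * n + j - 1] + u[i * n + j + 1]);
            }
        }
        // The freshly computed grid becomes the input of the next sweep.
        std::swap(u, next);
    }
    return u;
}
```

The implicit barrier at the end of each parallel loop forces all threads to wait before the next sweep begins. This kind of global synchronization is what a task-based model such as ParalleX aims to relax by expressing dependencies between individual blocks of the grid instead.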

Acknowledgements

We thank Matthew Anderson for helpful discussions and for preparing the irregular grids we used. We thank Georg Hager for optimizing the OpenMP usage and for suggesting the applied performance model. We acknowledge the support from the Center for Computation and Technology (CCT) at Louisiana State University (LSU) and from NSF grants (1029161, 1117470) to LSU.

Author information

Corresponding author

Correspondence to T. Heller.

About this article

Cite this article

Heller, T., Kaiser, H. & Iglberger, K. Application of the ParalleX execution model to stencil-based problems. Comput Sci Res Dev 28, 253–261 (2013). https://doi.org/10.1007/s00450-012-0217-1

  • DOI: https://doi.org/10.1007/s00450-012-0217-1
