Hybrid Programming Model for Implicit PDE Simulations on Multicore Architectures

  • Conference paper
OpenMP in the Petascale Era (IWOMP 2011)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6665)

Abstract

The complexity of programming modern multicore-processor-based clusters is rising rapidly, with GPUs adding further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured mesh CFD code. At the implementation level, the effects of cache locality, update management, work division, and synchronization frequency are studied. The hybrid model also presents interesting algorithmic opportunities: the linear system solver converges faster than in the pure MPI case because the parallel preconditioner stays stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), getting good performance requires attention to data partitioning issues similar to those in the message-passing case.
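The claim that the preconditioner "stays stronger" in the hybrid model can be illustrated with a toy problem. The sketch below is not the paper's CFD code or its preconditioner. It assumes a 1D Poisson matrix, a block Jacobi preconditioner with exact block solves, and a conjugate gradient solver. The number of blocks stands in for the number of MPI subdomains: a hybrid run uses fewer, larger subdomains per node, so less coupling is discarded by the preconditioner. All function names here are hypothetical.

```python
# Toy model (an assumption, not the paper's setup): A = tridiag(-1, 2, -1),
# block Jacobi preconditioner with one diagonal block per "subdomain".

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_A(x):
    """Matrix-vector product with A = tridiag(-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def tridiag_solve(rhs):
    """Solve tridiag(-1, 2, -1) x = rhs with the Thomas algorithm."""
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def block_jacobi(r, nblocks):
    """Apply M^{-1}: solve each diagonal block exactly, ignore coupling."""
    size = len(r) // nblocks  # assumes nblocks divides len(r)
    z = []
    for b in range(nblocks):
        z.extend(tridiag_solve(r[b * size:(b + 1) * size]))
    return z

def pcg(n, nblocks, tol=1e-8, maxit=500):
    """Preconditioned CG for A x = b with b = ones; returns (iterations, x)."""
    b = [1.0] * n
    x = [0.0] * n
    r = b[:]
    z = block_jacobi(r, nblocks)
    p = z[:]
    rz = dot(r, z)
    bnorm = dot(b, b) ** 0.5
    for k in range(1, maxit + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * bnorm:
            return k, x
        z = block_jacobi(r, nblocks)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return maxit, x

# Iteration counts for 1, 2, and 8 blocks on a 48-unknown problem.
counts = {nb: pcg(48, nb)[0] for nb in (1, 2, 8)}
for nb, its in counts.items():
    print(f"{nb} block(s): {its} CG iterations")
```

In this toy setting the iteration count does not decrease as the number of blocks grows, which mirrors the qualitative argument in the abstract. The magnitude of the effect in a real unstructured mesh CFD code depends on the actual preconditioner and partitioning.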

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaushik, D., Keyes, D., Balay, S., Smith, B. (2011). Hybrid Programming Model for Implicit PDE Simulations on Multicore Architectures. In: Chapman, B.M., Gropp, W.D., Kumaran, K., Müller, M.S. (eds) OpenMP in the Petascale Era. IWOMP 2011. Lecture Notes in Computer Science, vol 6665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21487-5_2

  • DOI: https://doi.org/10.1007/978-3-642-21487-5_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21486-8

  • Online ISBN: 978-3-642-21487-5

  • eBook Packages: Computer Science (R0)