Hybrid Parallel ILU Preconditioner in Linear Solver Library GaspiLS

  • Conference paper
High Performance Computing (ISC High Performance 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13289))


Abstract

Krylov subspace solvers such as GMRES, combined with preconditioners such as incomplete LU (ILU), are the most commonly used methods for efficiently solving general-purpose, large-scale linear systems in simulations. Parallel Krylov subspace solvers and preconditioners with good scalability are required to fully exploit the increasing parallelism provided by modern hardware. As such, they are crucial for productivity: they provide a high-level abstraction over the details of a complex hybrid parallel implementation and are therefore easy to use for the domain expert. However, the ILU factorization and the subsequent triangular solve are sequential in their basic form. We use a multilevel nested dissection (MLND) ordering to resolve that issue and expose some parallelism. We investigate the parallel efficiency of a hybrid parallel ILU preconditioner that combines a restricted additive Schwarz (RAS) method on the process level with a shared-memory parallel MLND Crout ILU method on the thread level. We employ the PGAS-based programming model GASPI to efficiently implement the data exchange across processes. We demonstrate the scalability of our approach on up to 64 sockets (1280 cores) for the convection-diffusion problem, a representative of a large class of engineering problems, and show comparable baseline performance against the linear solver library PETSc. The RAS preconditioned GMRES solver achieves about 80% parallel efficiency on 1280 cores. Our implementation provides a generic, algebraic, scalable, and efficient preconditioner that enables productivity for the domain expert in solving large-scale sparse linear systems.
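To make the combination of RAS and GMRES described in the abstract concrete, the following is a minimal, sequential sketch in pure Python. It is not the GaspiLS implementation and does not use its API: all function names (`stencil_1d`, `ras_apply`, `gmres`, `solve_demo`) are hypothetical, the test problem is assumed to be a 1D upwind-discretized convection-diffusion equation, and the loop over subdomains merely stands in for the processes. Each subdomain block is tridiagonal here, so an ILU without fill-in coincides with the exact LU factorization; the sketch therefore uses a plain tridiagonal LU solve in place of the MLND Crout ILU. Threading, MLND ordering, and GASPI communication are left out entirely.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def stencil_1d(n, c):
    """(lower, diag, upper) coefficients of -u'' + c*u' with upwind differences (assumed model)."""
    h = 1.0 / (n + 1)
    return (-1.0 / h**2 - c / h, 2.0 / h**2 + c / h, -1.0 / h**2)

def matvec(st, x):
    lo, di, up = st
    n = len(x)
    return [(lo * x[i - 1] if i > 0 else 0.0) + di * x[i]
            + (up * x[i + 1] if i < n - 1 else 0.0) for i in range(n)]

def local_solve(st, r):
    """Tridiagonal LU solve of one subdomain block (equals ILU(0) here: no fill-in)."""
    lo, di, up = st
    m = len(r)
    piv, y = [di] * m, list(r)
    for i in range(1, m):              # forward elimination: L y = r
        l = lo / piv[i - 1]
        piv[i] = di - l * up
        y[i] -= l * y[i - 1]
    x = [0.0] * m
    x[-1] = y[-1] / piv[-1]
    for i in range(m - 2, -1, -1):     # back substitution: U x = y
        x[i] = (y[i] - up * x[i + 1]) / piv[i]
    return x

def ras_apply(st, r, parts, overlap):
    """z = M^{-1} r: solve on overlapping subdomains, keep only the owned entries."""
    n = len(r)
    z = [0.0] * n
    for start, stop in parts:          # owned index range [start, stop)
        a, b = max(0, start - overlap), min(n, stop + overlap)
        loc = local_solve(st, r[a:b])
        for i in range(start, stop):   # "restricted": overlap values are discarded
            z[i] = loc[i - a]
    return z

def gmres(apply_a, apply_m, b, tol=1e-10, max_iter=None):
    """Right-preconditioned GMRES (no restart, x0 = 0, modified Gram-Schmidt, Givens)."""
    n = len(b)
    max_iter = max_iter or n
    beta = norm(b)
    V, H, cs, sn, g = [[bi / beta for bi in b]], [], [], [], [beta]
    for k in range(max_iter):
        w = apply_a(apply_m(V[k]))
        h = []
        for v in V:                    # orthogonalize against the current basis
            hij = dot(w, v)
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        hk1 = norm(w)
        for j in range(k):             # apply earlier Givens rotations to the new column
            t = cs[j] * h[j] + sn[j] * h[j + 1]
            h[j + 1] = -sn[j] * h[j] + cs[j] * h[j + 1]
            h[j] = t
        d = math.hypot(h[k], hk1)      # new rotation eliminates the subdiagonal entry
        cs.append(h[k] / d)
        sn.append(hk1 / d)
        h[k] = d
        g.append(-sn[k] * g[k])
        g[k] = cs[k] * g[k]
        H.append(h)
        if abs(g[k + 1]) <= tol * beta or hk1 == 0.0:
            break
        V.append([wi / hk1 for wi in w])
    m = len(H)
    y = [0.0] * m
    for i in range(m - 1, -1, -1):     # solve the triangular least-squares system
        y[i] = (g[i] - sum(H[j][i] * y[j] for j in range(i + 1, m))) / H[i][i]
    u = [sum(y[j] * V[j][i] for j in range(m)) for i in range(n)]
    return apply_m(u), m

def solve_demo(n=64, c=10.0, nparts=4, overlap=2):
    st = stencil_1d(n, c)
    size = n // nparts
    parts = [(p * size, n if p == nparts - 1 else (p + 1) * size) for p in range(nparts)]
    b = [1.0] * n
    x, iters = gmres(lambda v: matvec(st, v),
                     lambda r: ras_apply(st, r, parts, overlap), b)
    res = [bi - ai for bi, ai in zip(b, matvec(st, x))]
    return x, iters, norm(res) / norm(b)

x, iters, rel_res = solve_demo()
print(f"GMRES iterations: {iters}, relative residual: {rel_res:.2e}")
```

In this sketch each subdomain solve inside `ras_apply` is independent of the others, which is the property that allows the process-level parallelism; only the residual entries in the overlap regions would need to be exchanged between neighbours in a distributed setting.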



Author information

Corresponding author

Correspondence to Raju Ram.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Ram, R., Grünewald, D., Gauger, N.R. (2022). Hybrid Parallel ILU Preconditioner in Linear Solver Library GaspiLS. In: Varbanescu, AL., Bhatele, A., Luszczek, P., Marc, B. (eds) High Performance Computing. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13289. Springer, Cham. https://doi.org/10.1007/978-3-031-07312-0_17

  • DOI: https://doi.org/10.1007/978-3-031-07312-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07311-3

  • Online ISBN: 978-3-031-07312-0

  • eBook Packages: Computer Science, Computer Science (R0)
