
Design of a Task-Parallel Version of ILUPACK for Graphics Processors

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 697)

Abstract

In many scientific and engineering applications, the solution of large sparse systems of equations is one of the most important stages. For this reason, many libraries have been developed, among which ILUPACK stands out due to its efficient inverse-based multilevel preconditioner. Several parallel versions of ILUPACK have been proposed in the past. In particular, two task-parallel versions, for shared- and distributed-memory platforms, and a GPU-accelerated data-parallel variant have been developed to solve symmetric positive definite linear systems. In this work we evaluate the combination of both approaches. Specifically, we leverage the computational power of one GPU (associated with the data-level parallelism) to accelerate each computation of the multicore (task-parallel) variant of ILUPACK. The experimental evaluation shows that our proposal can accelerate the multicore variant when the leaf tasks of the parallel solver are sufficiently large.
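The combination described in the abstract can be illustrated with a toy sketch. This is not ILUPACK code: the names `choose_device`, `run_leaf`, `apply_block_preconditioner` and the threshold `GPU_MIN_ROWS` are hypothetical, a plain diagonal solve stands in for the inverse-based multilevel preconditioner, and the GPU path is only simulated on the CPU. The sketch shows only the scheduling idea, assuming independent leaf tasks: leaves run concurrently (task parallelism), and each leaf is marked for offloading only when it is large enough to benefit from data parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical threshold: leaves with fewer rows stay on the CPU.
GPU_MIN_ROWS = 4


def choose_device(n_rows, gpu_min_rows=GPU_MIN_ROWS):
    """Pick where a leaf runs; only large leaves are worth offloading."""
    return "gpu" if n_rows >= gpu_min_rows else "cpu"


def run_leaf(leaf_id, diag, rhs):
    """Leaf task: toy diagonal solve on one independent block.

    The "gpu" label is a placeholder. A real code would launch a
    data-parallel kernel for such leaves instead of this Python loop.
    """
    device = choose_device(len(diag))
    solution = [r / d for d, r in zip(diag, rhs)]
    return leaf_id, device, solution


def apply_block_preconditioner(leaves, workers=2):
    """Run all leaf tasks concurrently and gather results in leaf order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_leaf, i, d, r)
                   for i, (d, r) in enumerate(leaves)]
        results = [f.result() for f in futures]
    devices = [dev for _, dev, _ in results]
    z = [x for _, _, sol in results for x in sol]
    return devices, z


# Two independent leaves given as (diagonal, right-hand side) pairs.
leaves = [
    ([2.0, 4.0], [2.0, 8.0]),                      # small leaf
    ([1.0, 2.0, 4.0, 8.0], [1.0, 2.0, 4.0, 8.0]),  # larger leaf
]
devices, z = apply_block_preconditioner(leaves)
```

In this toy setting the small leaf stays on the CPU and only the larger one is flagged for the accelerator, which mirrors the abstract's observation that the GPU pays off only when leaf tasks are large enough.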

Notes

  1. http://ilupack.tu-bs.de

Acknowledgments

The researchers from the Universidad Jaime I were supported by the CICYT project TIN2014-53495R. The researchers from UdelaR were supported by PEDECIBA and a CAP-UdelaR grant.

Author information

Corresponding author

Correspondence to Ernesto Dufrechou.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Aliaga, J.I., Dufrechou, E., Ezzatti, P., Quintana-Ortí, E.S. (2017). Design of a Task-Parallel Version of ILUPACK for Graphics Processors. In: Barrios Hernández, C., Gitler, I., Klapp, J. (eds) High Performance Computing. CARLA 2016. Communications in Computer and Information Science, vol 697. Springer, Cham. https://doi.org/10.1007/978-3-319-57972-6_7

  • DOI: https://doi.org/10.1007/978-3-319-57972-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57971-9

  • Online ISBN: 978-3-319-57972-6

  • eBook Packages: Computer Science, Computer Science (R0)
