Abstract
The multifrontal method is a well-established approach to the parallel direct solution of systems of linear algebraic equations with sparse symmetric positive-definite matrices. This paper discusses approaches to, and challenges of, a scalable parallel implementation of the numerical phase of the multifrontal method on shared-memory systems based on high-end server CPUs with dozens of cores. Commonly used parallelization schemes are often guided by an elimination tree, which encodes the dependencies between logical tasks in the computational loop of the method. We consider a dynamic two-level scheme for organizing parallel computations. This scheme uses a task-based model that switches dynamically between solving relatively small tasks in parallel and using parallel BLAS functions for relatively large tasks. Implementing this scheme raises several problems, including time-consuming synchronization and the need for careful memory management. We improve performance and scaling efficiency by using the parallelism model and memory management tools of the Threading Building Blocks library. Experiments on large symmetric matrices from the SuiteSparse Matrix Collection show that our implementation is competitive with the commercial direct sparse solver Intel MKL PARDISO.
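The two-level idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use Threading Building Blocks: the function name `two_level_schedule`, the size `threshold`, the placeholder `process_front`, and the policy of handling large fronts only when no small task is in flight are all assumptions made for illustration. A thread pool stands in for the task-based level, and a call on the main thread stands in for a parallel BLAS call that would occupy all cores.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def two_level_schedule(parent, front_size, threshold, workers=4):
    """Process elimination-tree nodes children-first.

    parent[v] is the tree parent of node v (-1 for a root).
    Fronts smaller than `threshold` run as independent pool tasks;
    larger fronts run one at a time on the calling thread.
    Returns a list of (node, mode) pairs in completion order.
    """
    n = len(parent)
    pending = [0] * n  # unfinished children per node
    for p in parent:
        if p >= 0:
            pending[p] += 1
    ready = [v for v in range(n) if pending[v] == 0]  # leaves
    large, running, log = [], {}, []

    def process_front(node):
        # Hypothetical placeholder for assembling and factorizing one front.
        return node

    def complete(node, mode):
        # Called only from the main thread, so no locking is needed.
        log.append((node, mode))
        p = parent[node]
        if p >= 0:
            pending[p] -= 1
            if pending[p] == 0:
                ready.append(p)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while ready or running or large:
            # Level 1: every ready small front becomes an independent task.
            for node in ready:
                if front_size[node] < threshold:
                    running[pool.submit(process_front, node)] = node
                else:
                    large.append(node)
            ready.clear()
            if running:
                done, _ = wait(running, return_when=FIRST_COMPLETED)
                for fut in done:
                    complete(running.pop(fut), "task")
            elif large:
                # Level 2: no small task is in flight, so a large front
                # gets the whole machine (stand-in for parallel BLAS).
                complete(process_front(large.pop()), "blas")
    return log


# A small hypothetical tree: nodes 0,1 -> 2; nodes 3,4 -> 5; nodes 2,5 -> 6.
example_parent = [2, 2, 6, 5, 5, 6, -1]
example_sizes = [4, 4, 16, 4, 4, 16, 64]
print(two_level_schedule(example_parent, example_sizes, threshold=32))
```

The completion order of the small tasks can vary between runs; the sketch only guarantees that a node is processed after all of its children.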
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Povelikin, R., Lebedev, S., Meyerov, I. (2019). Multithreaded Multifrontal Sparse Cholesky Factorization Using Threading Building Blocks. In: Voevodin, V., Sobolev, S. (eds) Supercomputing. RuSCDays 2019. Communications in Computer and Information Science, vol 1129. Springer, Cham. https://doi.org/10.1007/978-3-030-36592-9_7
DOI: https://doi.org/10.1007/978-3-030-36592-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36591-2
Online ISBN: 978-3-030-36592-9
eBook Packages: Computer Science, Computer Science (R0)