XHYPRE: a reliable parallel numerical algorithm library for solving large-scale sparse linear equations

Li, Chuanying; Graillat, Stef; Quan, Zhe; Gu, Tong-Xiang; Jiang, Hao; Li, Kenli

doi:10.1007/s42514-023-00141-3

Regular Paper
Published: 04 April 2023

Volume 5, pages 191–209, (2023)

CCF Transactions on High Performance Computing

Chuanying Li ORCID: orcid.org/0000-0003-3849-6745¹,
Stef Graillat²,
Zhe Quan¹,
Tong-Xiang Gu⁴,
Hao Jiang³ &
…
Kenli Li¹

250 Accesses
1 Citation

Abstract

With the rapid development of supercomputers, large-scale computing has become increasingly widespread across scientific research and engineering. The precision and efficiency of large-scale floating-point arithmetic have long been an active research topic in high-performance computing. This paper studies numerical methods for solving large-scale sparse linear equations, where rounding errors accumulated during the solution process lead to inaccurate results and the large data size leads to long solver running times. To address these issues, we use error-free transformation technology and mixed-precision ideas to construct a reliable parallel numerical algorithm framework based on HYPRE, which improves the accuracy and accelerates the numerical solution of large-scale sparse linear equations. We illustrate the implementation details of our technique with two cases. In the first, we use error-free transformation technology to design high-precision iterative algorithms, such as GMRES, PCG, and BICGSTAB, which reduce rounding errors during the calculation and make the results more accurate. In the second, we propose a mixed-precision iterative algorithm that uses low-precision formats to achieve higher computing power and reduce computing time. Experimental results demonstrate that XHYPRE has higher reliability and effectiveness. XHYPRE is on average 1.3x faster than HYPRE and reduces the number of iterations to 87.1% on average.
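To make the notion of an error-free transformation concrete, the following is a minimal Python sketch of a compensated dot product built from the classical TwoSum and TwoProduct transformations. It is an illustration only, not the XHYPRE implementation (which is built on HYPRE); the function names `two_sum`, `split`, `two_prod`, `dot2`, and `naive_dot` are chosen here for the example, and IEEE 754 double precision without overflow or underflow is assumed.

```python
def two_sum(a, b):
    """TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def split(a):
    """Veltkamp splitting of a double into a high and a low part."""
    c = 134217729.0 * a  # factor 2**27 + 1 for 53-bit significands
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a, b):
    """TwoProduct: return (p, e) with p = fl(a * b) and p + e == a * b exactly."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, e


def dot2(x, y):
    """Compensated dot product: accumulate the rounding errors separately."""
    s, c = 0.0, 0.0
    for xi, yi in zip(x, y):
        p, err_prod = two_prod(xi, yi)   # exact product as p + err_prod
        s, err_sum = two_sum(s, p)       # exact sum as s + err_sum
        c += err_prod + err_sum          # running correction term
    return s + c


def naive_dot(x, y):
    """Plain recursive dot product in working precision, for comparison."""
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s


if __name__ == "__main__":
    x = [1e16, 1.0, -1e16]
    y = [1.0, 1.0, 1.0]
    print("naive:", naive_dot(x, y))
    print("dot2: ", dot2(x, y))
```

On this small ill-conditioned input the plain loop loses the contribution of the middle term to cancellation, while the compensated version recovers it. Dot products of this kind dominate the inner loops of Krylov solvers such as GMRES, PCG, and BICGSTAB, which is why compensating them is a natural place to reduce accumulated rounding error.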

Acknowledgements

This work was supported by the NuSCAP (ANR-20-CE48-0014) project of the French National Agency for Research (ANR), the 173 program (2020-JCJQ-ZD-029), and the Science Challenge Project (TZ2016002).

Author information

Authors and Affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
Chuanying Li, Zhe Quan & Kenli Li
Sorbonne Université, CNRS, LIP6, 10587, F-75005, Paris, France
Stef Graillat
College of Computer, National University of Defense Technology, Changsha, 410073, China
Hao Jiang
Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Beijing, 100094, China
Tong-Xiang Gu

Corresponding author

Correspondence to Hao Jiang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, C., Graillat, S., Quan, Z. et al. XHYPRE: a reliable parallel numerical algorithm library for solving large-scale sparse linear equations. CCF Trans. HPC 5, 191–209 (2023). https://doi.org/10.1007/s42514-023-00141-3

Received: 30 July 2022
Accepted: 14 March 2023
Published: 04 April 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s42514-023-00141-3
