Detection and correction of silent errors in the conjugate gradient algorithm

Meurant, Gérard

doi:10.1007/s11075-022-01380-1

Original Paper
Published: 29 July 2022

Volume 92, pages 869–891, (2023)
Numerical Algorithms

Gérard Meurant ORCID: orcid.org/0000-0002-6036-3482¹

Abstract

We propose a new way to detect and correct silent errors in the conjugate gradient algorithm. The detection criterion is simple, cheap to implement, and can be used at each iteration. This simplifies the correction process. Numerical experiments show that the new criterion is robust and reliable.

Notes

They can be obtained at https://sparse.tamu.edu.

Acknowledgements

The author thanks Erin Carson for interesting comments and suggestions.

Author information

Authors and Affiliations

30 rue du sergent Bauchat, Paris, 75012, France
Gérard Meurant

Ethics declarations

Declarations

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Conflict of interest

The author declares no competing interests.

Additional information

This article is dedicated to Claude Brezinski on the occasion of his 80th birthday.

Appendix A: CG local orthogonality

Using (2) it can be shown that

$$\begin{aligned}
\vert (r_{k}, r_{k-1})\vert \le {}& \kappa (A)\, \frac{\Vert r_{k-1}\Vert }{\Vert p_{k-1}\Vert } \left[ C_{k-1}^A u\, \frac{\Vert r_{k-1}\Vert }{\Vert p_{k-1}\Vert }\,\frac{\Vert r_{k-1}\Vert ^2}{\Vert r_{k-2}\Vert ^2} + \Vert r_{k-1}\Vert \,\Vert \delta _{k-2}^p\Vert \right] \\
&+ \Vert \delta _{k-1}^r\Vert \,\Vert r_{k-1}\Vert ,
\end{aligned} \tag{A1}$$

where $\kappa (A)$ is the condition number of $A$, $\lambda _n$ is the largest eigenvalue of $A$, and $C_{k-1}^A$ is the constant in the bound $\vert (Ap_{k-2}, p_{k-1})\vert \le \lambda _n C_{k-1}^A u$.

The three terms on the right-hand side of inequality (A1) are small provided that the ratios $\Vert r_{k-1}\Vert /\Vert p_{k-1}\Vert$ and $\Vert r_{k-1}\Vert ^2/\Vert r_{k-2}\Vert ^2$ are bounded and $\kappa (A)$ is not too large. Local orthogonality is, in general, well satisfied.
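
As a rough numerical illustration of this statement, the following Python sketch runs a textbook conjugate gradient iteration and records the quantity $\vert (r_{k}, r_{k-1})\vert$ at each step. Several choices here are assumptions made for illustration and are not taken from the article: the test matrix (a small 1D Laplacian), the right-hand side, the function name `cg_local_orthogonality`, and the normalization of the inner product by $\Vert r_{k}\Vert \,\Vert r_{k-1}\Vert$. It is not the detection criterion proposed in the article, only a way to observe local orthogonality in floating point arithmetic.

```python
import numpy as np


def cg_local_orthogonality(A, b, num_iters):
    """Run plain CG from x_0 = 0 and record the normalized local
    orthogonality |(r_k, r_{k-1})| / (||r_k|| ||r_{k-1}||) at each step."""
    x = np.zeros_like(b)
    r = b.copy()              # r_0 = b - A x_0 with x_0 = 0
    p = r.copy()              # first direction vector p_0 = r_0
    rr = r @ r
    ratios = []
    for _ in range(num_iters):
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        rr_new = r_new @ r_new
        if rr_new == 0.0:     # exact convergence, nothing left to measure
            break
        ratios.append(abs(r_new @ r) / np.sqrt(rr_new * rr))
        beta = rr_new / rr
        p = r_new + beta * p
        r, rr = r_new, rr_new
    return x, np.array(ratios)


# Hypothetical test problem: 1D Laplacian, symmetric positive definite.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

x, ratios = cg_local_orthogonality(A, b, 20)
print("largest normalized |(r_k, r_{k-1})|:", ratios.max())
```

For a matrix with a moderate condition number such as this one, the recorded values should stay many orders of magnitude below one, in line with the remark that local orthogonality is generally well satisfied.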

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meurant, G. Detection and correction of silent errors in the conjugate gradient algorithm. Numer Algor 92, 869–891 (2023). https://doi.org/10.1007/s11075-022-01380-1

Received: 24 February 2022
Accepted: 17 July 2022
Published: 29 July 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s11075-022-01380-1

Part of a collection:

CIRM21
