Abstract
This study examines mixed-precision iterative refinement using posit numbers in place of standard IEEE floating-point. The method is applied to a general linear system \(Ax = b\), where \(A\) is a large sparse matrix. Multiple scaling strategies, including row and column equilibration, map matrix entries into higher-density regions of the machine numbers before the \(O(n^3)\) factorization is performed. A low-precision LU factorization followed by forward/backward substitution yields an initial estimate \(\hat{x}\). The residual \(r = b - A\hat{x}\) is then computed in higher precision with a deferred rounding mechanism and used as the right-hand side of the correction system \(Ac = r\). The corrector \(c\) is computed and used to update the solution, \(\hat{x} \leftarrow \hat{x} + c\), and the process repeats until convergence. Results show that a 16-bit posit configuration coupled with equilibration yields accuracy comparable to IEEE half precision (fp16), demonstrating the potential of posits for balancing efficiency and accuracy.
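To make the loop concrete, here is a minimal sketch in Python/NumPy under stated assumptions: float32 stands in for the low-precision format (the paper uses 16-bit posits), float64 stands in for the higher-precision residual accumulation (the paper uses a deferred-rounding posit accumulator, the quire), and `equilibrate` and `refine` are illustrative helper names, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def equilibrate(A):
    # One pass of row/column equilibration: scale A so the largest entry
    # in each row and column has magnitude near 1, moving entries toward
    # the dense region of the target number system.
    r = 1.0 / np.abs(A).max(axis=1)        # row scale factors
    As = r[:, None] * A
    c = 1.0 / np.abs(As).max(axis=0)       # column scale factors
    return r, c, As * c[None, :]

def refine(A, b, tol=1e-12, max_iter=20):
    # Mixed-precision iterative refinement for A x = b.
    rs, cs, As = equilibrate(A)
    bs = rs * b                            # scaled right-hand side
    # O(n^3) LU factorization, performed once, in the low working precision.
    lu, piv = lu_factor(As.astype(np.float32))
    y = lu_solve((lu, piv), bs.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        # Residual in higher precision; the paper instead accumulates it
        # with a deferred-rounding posit accumulator (the quire).
        res = bs - As @ y
        if np.linalg.norm(res, np.inf) <= tol * np.linalg.norm(bs, np.inf):
            break
        # Correction system A c = r, reusing the low-precision factors.
        corr = lu_solve((lu, piv), res.astype(np.float32)).astype(np.float64)
        y += corr
    return cs * y                          # undo the column scaling

# Example: a small, well-conditioned system.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100)) + 100.0 * np.eye(100)
b = rng.standard_normal(100)
x = refine(A, b)
print(np.linalg.norm(b - A @ x, np.inf))
```

The economy of the scheme is that the \(O(n^3)\) factorization happens once in low precision, while each refinement step costs only an \(O(n^2)\) residual evaluation and a pair of triangular solves.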
Notes
1. There is always at least one regime bit, and for \(n > 2\) the regime field (the run of identical bits plus its terminating bit) occupies at least two bits. For \(n > 2\), \(1 \le r \le n-1\), and therefore \(-(n-1) \le k \le n-2\) (see the sketch after these notes).
2. Formerly known as the University of Florida Sparse Matrix Collection.
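As a minimal illustration of the bounds in note 1, the hypothetical helper below decodes the regime value \(k\) from a posit bit string with the sign bit stripped: a run of \(r\) ones encodes \(k = r - 1\) and a run of \(r\) zeros encodes \(k = -r\), so \(1 \le r \le n-1\) gives exactly \(-(n-1) \le k \le n-2\).

```python
def regime_value(bits: str) -> int:
    # Decode the regime value k from a posit bit string (sign bit removed).
    # The regime is the leading run of identical bits, terminated by the
    # opposite bit or the end of the string: r ones -> k = r - 1,
    # r zeros -> k = -r.
    lead = bits[0]
    r = len(bits) - len(bits.lstrip(lead))
    return r - 1 if lead == "1" else -r

# For an n-bit posit the run occupies 1 <= r <= n - 1 bits, hence
# -(n-1) <= k <= n-2.
assert regime_value("0001") == -3   # r = 3 zeros -> k = -3
assert regime_value("110") == 1     # r = 2 ones  -> k = 1
```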
Acknowledgments
The first author acknowledges partial support for this research by the Maine Economic Improvement Fund (MEIF). However, the authors received no direct financial support for the research, authorship, and/or publication of this article.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Quinlan, J., Omtzigt, E.T.L. (2024). Iterative Refinement with Low-Precision Posit Arithmetic. In: Michalewicz, M., Gustafson, J., De Silva, H. (eds) Next Generation Arithmetic. CoNGA 2024. Lecture Notes in Computer Science, vol 14666. Springer, Cham. https://doi.org/10.1007/978-3-031-72709-2_3
DOI: https://doi.org/10.1007/978-3-031-72709-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72708-5
Online ISBN: 978-3-031-72709-2
eBook Packages: Computer Science, Computer Science (R0)