
Abstract

This chapter, somewhat different in character from the previous chapters, deals with fundamental theory on the accuracy of numerical computation and with several cases of practical importance. We must remember that the output data of a computer always contain numerical errors; in particular, there are important points that must not be overlooked when parallelizing code. The pursuit of computational speed is, of course, the central theme of this book, but it presupposes that the computation produces correct results. This chapter introduces numerical computation with guaranteed accuracy for large-scale problems, convergence-accuracy problems in parallel computing, and high-precision computation in HPC.


Notes

  1. For details of the specification when there are two equally close floating-point numbers (a tie), or when an operation approaches overflow, please consult the standard [1].
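     As a minimal illustration of such a tie (a sketch in C, not from the original note), the default round-to-nearest mode chooses the candidate whose last significand bit is even:

        #include <float.h>
        #include <stdio.h>

        int main(void) {
            /* 1 + DBL_EPSILON/2 lies exactly halfway between the adjacent
               doubles 1.0 and 1.0 + DBL_EPSILON; under IEEE 754's default
               "round to nearest, ties to even" it rounds to 1.0. */
            double x = 1.0 + DBL_EPSILON / 2.0;
            printf("x == 1.0 ? %s\n", x == 1.0 ? "yes" : "no"); /* yes */
            return 0;
        }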

  2. In Fortran, when conforming to the Fortran 2003 standard, the intrinsic modules IEEE_ARITHMETIC and IEEE_FEATURES make it possible to change the rounding mode. For instance, the statement CALL IEEE_SET_ROUNDING_MODE(IEEE_NEAREST) selects round-to-nearest; replacing IEEE_NEAREST with IEEE_UP selects rounding upward, and replacing it with IEEE_DOWN selects rounding downward.
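     The analogous facility in C99 is fesetround from <fenv.h>. The following sketch (an illustration, not code from the chapter) switches the rounding direction around a division to obtain a rigorous enclosure of the true result:

        #include <fenv.h>
        #include <stdio.h>

        /* Declare access to the floating-point environment (C99;
           some compilers only warn and ignore this pragma). */
        #pragma STDC FENV_ACCESS ON

        int main(void) {
            volatile double a = 1.0, b = 3.0;

            fesetround(FE_DOWNWARD);      /* round toward -infinity */
            double lo = a / b;
            fesetround(FE_UPWARD);        /* round toward +infinity */
            double hi = a / b;
            fesetround(FE_TONEAREST);     /* restore the default mode */

            printf("1/3 lies in [%.17g, %.17g]\n", lo, hi);
            return 0;
        }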

  3. If the order of computation is changed by compiler optimization, an operation not conforming to the IEEE 754 standard may be performed, and the result may unintentionally no longer constitute a numerical computation with guaranteed accuracy. To inhibit such optimization, it is therefore necessary either to give the relevant variables the volatile attribute stipulated by the C and Fortran 2003 standards, or to pass compiler options for floating-point arithmetic (such as -fp-model) so that the operations conform strictly to the standard, as sketched below.
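     A minimal sketch of the volatile technique in C (an illustration, not code from the chapter):

        #include <stdio.h>

        /* Without volatile, an optimizer may fold (a + b) - a to b, which
           is exact in real arithmetic but hides the rounding error that a
           verified computation needs to observe. */
        volatile double a = 1.0e16;
        volatile double b = 1.0;

        int main(void) {
            double s = a + b;  /* 1.0e16 + 1 rounds back to 1.0e16 */
            double r = s - a;  /* 0, not 1: the addition lost b entirely */
            printf("r = %g\n", r);
            return 0;
        }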

  4. When __float128 is available in the C/C++ implementation, it is only necessary to switch to it with a typedef. C/C++ are problematic regarding the size and interpretation of numeric types, which may differ between implementations and architectures: long double may be the 80-bit extended format, IEEE 754 binary64 (double precision), or IEEE 754 binary128 (quadruple precision), and compilers on IBM Power processors treat long double as double-double. Even on the same 64-bit architecture, data models such as LLP64, LP64, and ILP64 differ; if the same program is compiled on the same machine and OS by two compilers using different data models, the two binaries may give different results (a segmentation fault usually occurs for the unintended data model).
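     A sketch of the typedef switch follows; note that __float128 is a compiler extension (e.g. GCC/Clang on x86-64), not standard C, and the guard macro __SIZEOF_FLOAT128__ is a GCC/Clang convention:

        #include <stdio.h>

        #ifdef __SIZEOF_FLOAT128__
        typedef __float128 real;   /* IEEE 754 binary128 where supported */
        #else
        typedef long double real;  /* whatever the ABI says it is */
        #endif

        int main(void) {
            printf("sizeof(double)      = %zu\n", sizeof(double));      /* 8 */
            printf("sizeof(long double) = %zu\n", sizeof(long double)); /* 8, 12, or 16 */
            printf("sizeof(real)        = %zu\n", sizeof(real));
            return 0;
        }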

  5. FMA performs \(a\times b+c\) in one instruction: it evaluates \(a\times b + c\) exactly and rounds the result once to double precision. It is often used for inner-product calculations and matrix–matrix multiplications. The reason such hardware is implemented in recent CPUs is that the arithmetic unit must contain both an adder and a multiplier so that each instruction can be processed without stalling; implementing FMA is a good way to utilize these two operators maximally, as it keeps both the adder and the multiplier in the arithmetic unit occupied.
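     As a sketch of why FMA matters for accuracy (an illustration, not from the original note), the C99 fma function can extract the exact rounding error of a product; this error-free product is the building block of accurate dot-product algorithms such as [7]:

        #include <math.h>
        #include <stdio.h>

        int main(void) {
            double a = 1.0 + 0x1p-27;      /* 1 + 2^-27 */
            double b = 1.0 - 0x1p-27;

            double p   = a * b;            /* rounded product */
            double err = fma(a, b, -p);    /* a*b - p evaluated exactly and
                                              rounded once: the error of p */

            /* p + err reproduces a*b exactly. */
            printf("p = %.17g, err = %.17g\n", p, err);
            return 0;
        }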

  6. SSE and AVX stand for Streaming SIMD Extensions and Intel Advanced Vector Extensions, respectively; both can perform operations on several double-precision numbers collectively with one instruction.
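     A minimal AVX sketch in C (assumes an x86-64 compiler with AVX enabled, e.g. the -mavx option, and the <immintrin.h> intrinsics):

        #include <immintrin.h>
        #include <stdio.h>

        int main(void) {
            /* One 256-bit AVX instruction adds four doubles at once. */
            double a[4] = {1.0, 2.0, 3.0, 4.0};
            double b[4] = {0.5, 0.5, 0.5, 0.5};
            double c[4];

            __m256d va = _mm256_loadu_pd(a);
            __m256d vb = _mm256_loadu_pd(b);
            __m256d vc = _mm256_add_pd(va, vb);  /* vaddpd */
            _mm256_storeu_pd(c, vc);

            printf("%g %g %g %g\n", c[0], c[1], c[2], c[3]);
            return 0;
        }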

References

  1. IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008 (2008)

  2. S. Oishi, Numerical Methods with Guaranteed Accuracy (Corona-sya, 2000, in Japanese)

  3. R.E. Moore, R.B. Kearfott, M.J. Cloud, Introduction to Interval Analysis (Society for Industrial and Applied Mathematics, Philadelphia, 2009)

  4. S. Oishi, S.M. Rump, Fast verification of solutions of matrix equations. Numer. Math. 90(4), 755–773 (2002)

  5. T. Ogita, S.M. Rump, S. Oishi, Verified solution of linear systems without directed rounding, Technical Report 2005-04 (Advanced Research Institute for Science and Engineering, Waseda University, Tokyo, Japan, 2005)

  6. N.J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd edn. (SIAM, Philadelphia, 2002)

  7. T. Ogita, S.M. Rump, S. Oishi, Accurate sum and dot product. SIAM J. Sci. Comput. 26(6), 1955–1988 (2005)

  8. S. Koshizuka, Y. Oka, Moving-particle semi-implicit method for fragmentation of incompressible fluid. Nucl. Sci. Eng. 123, 421–434 (1996)

  9. H. Togawa, Conjugate Gradient Method (Kyoiku Shuppan, 1977, in Japanese)

  10. IEEE, IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008, pp. 1–70 (2008)

  11. D.H. Bailey, R. Barrio, J.M. Borwein, High-precision computation: mathematical physics and dynamics. Appl. Math. Comput. 218, 10106–10121 (2012)

  12. D.H. Bailey, J.M. Borwein, High-precision arithmetic in mathematical physics. Mathematics 3, 337–367 (2015)

  13. G. Beliakov, Y. Matiyasevich, A parallel algorithm for calculation of large determinants with high accuracy for GPUs and MPI clusters. arXiv:1308.1536v2

  14. N.J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd edn. (Society for Industrial and Applied Mathematics, Philadelphia, 2002)

  15. H. Hasegawa, Utilizing the quadruple-precision floating-point arithmetic operation for the Krylov subspace methods, in Proceedings of the 8th SIAM Conference on Applied Linear Algebra, vol. 25 (2012)

  16. M. Nakata, B.J. Braams, K. Fujisawa, M. Fukuda, J.K. Percus, M. Yamashita, Z. Zhao, Variational calculation of second-order reduced density matrices by strong N-representability conditions and an accurate semidefinite programming solver. J. Chem. Phys. 128, 164113 (2008)

  17. H. Waki, M. Nakata, M. Muramatsu, Strange behaviors of interior-point methods for solving semidefinite programming problems in polynomial optimization. Comput. Optim. Appl. 53, 823 (2012)

  18. F. Bornemann, D. Laurie, S. Wagon, J. Waldvogel, The SIAM 100-Digit Challenge: A Study in High-Accuracy Numerical Computing (Society for Industrial and Applied Mathematics, Philadelphia, 2004)

  19. D.E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd edn. (Addison-Wesley Professional, 1997)

  20. T.J. Dekker, A floating-point technique for extending the available precision. Numer. Math. 18, 224–242 (1971)

  21. Y. Hida, X.S. Li, D.H. Bailey, Library for double-double and quad-double arithmetic, Technical report (Lawrence Berkeley National Laboratory, 2008)

  22. M. Nakata, Y. Takao, S. Noda, R. Himeno, A fast implementation of matrix-matrix product in double-double precision on NVIDIA C2050 and application to semidefinite programming, in Third International Conference on Networking and Computing (ICNC) (2012)

  23. T. Granlund, GMP Development Team, GNU MP 6.0 Multiple Precision Arithmetic Library (Samurai Media Limited, United Kingdom, 2015)

  24. L. Fousse, G. Hanrot, V. Lefèvre, P. Pélissier, P. Zimmermann, MPFR: a multiple-precision binary floating-point library with correct rounding. ACM Trans. Math. Softw. 33, 13 (2007)

  25. A. Enge, M. Gastineau, P. Théveny, P. Zimmermann, MPC: a library for multiprecision complex arithmetic with exact rounding, INRIA, 1.0.3 edn., Feb 2015

  26. M. Nakata, MPACK, RIKEN, 0.8.0 edn. (2012)

  27. M. Nakata, MPACK 0.6.7: a high-precision linear algebra library. Appl. Math. 2110 (2011, in Japanese)

  28. T. Koya, BNCpack, 0.7 edn. (Shizuoka Institute of Science and Technology, 2011)

  29. B.N. Parlett, The Symmetric Eigenvalue Problem (Classics in Applied Mathematics) (Society for Industrial and Applied Mathematics, 1987)


Author information

Corresponding author: Shin'ichi Oishi.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Oishi, S., Morikura, Y., Sekine, K., Kuroda, H., Nakata, M. (2019). Techniques Concerning Computation Accuracy. In: Geshi, M. (ed.) The Art of High Performance Computing for Computational Science, Vol. 1. Springer, Singapore. https://doi.org/10.1007/978-981-13-6194-4_10


  • DOI: https://doi.org/10.1007/978-981-13-6194-4_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6193-7

  • Online ISBN: 978-981-13-6194-4

  • eBook Packages: Computer Science (R0)
