Abstract
In this chapter, we explain the basic architecture and use of the linear algebra libraries BLAS and LAPACK. BLAS and LAPACK carry out vector and matrix operations on computers. They are used by many programs, and their implementations are optimized for the computers they run on. These libraries should be used whenever possible for linear algebra operations, because algorithms taken directly from mathematical theorems in textbooks may be inefficient and may not achieve sufficient accuracy in practice. Moreover, programming such algorithms is bothersome. However, performance may suffer if you use a non-optimized library. In fact, the performance difference between a non-optimized and an optimized implementation is likely very large, so you should choose the fastest one for your computer. The availability of optimized BLAS and LAPACK libraries has improved remarkably; for example, they are now included in Linux distributions such as Ubuntu. In this chapter, we refer to the libraries for Ubuntu 16.04 so that readers can easily try them out for themselves. Unfortunately, we do not cover GPU implementations for lack of space. However, the basic ideas are the same as those presented in this chapter, so we believe that readers will easily be able to utilize them as well.
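As a concrete taste of the usage discussed in this chapter, here is a minimal sketch of calling one representative BLAS routine, DGEMM (double-precision matrix-matrix multiply, C := alpha*A*B + beta*C), through the CBLAS interface. The file name and the 2×2 data are our own illustrative choices; build with, e.g., `gcc dgemm_example.c -lopenblas` after installing an OpenBLAS development package, or link a reference CBLAS if one is installed.

```c
/* dgemm_example.c -- a minimal sketch of calling BLAS DGEMM through
 * the CBLAS interface. Build (one possibility, with OpenBLAS installed):
 *   gcc dgemm_example.c -o dgemm_example -lopenblas
 */
#include <stdio.h>
#include <cblas.h>

int main(void) {
    /* 2x2 matrices stored in row-major order. */
    double A[4] = {1.0, 2.0,
                   3.0, 4.0};
    double B[4] = {5.0, 6.0,
                   7.0, 8.0};
    double C[4] = {0.0, 0.0,
                   0.0, 0.0};

    /* C = 1.0 * A * B + 0.0 * C */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2,        /* M, N, K       */
                1.0, A, 2,      /* alpha, A, lda */
                B, 2,           /* B, ldb        */
                0.0, C, 2);     /* beta, C, ldc  */

    /* Prints C = [19 22; 43 50]. */
    printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
    return 0;
}
```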
Notes
- 1.
Although Carl Friedrich Gauss is sometimes credited with the rediscovery, Isaac Newton, about 100 years earlier, wrote that the textbooks of his day lacked a method for solving simultaneous equations and proceeded to publish one that became widely circulated.
- 2.
There was once a time when the format varied from one manufacturer or vendor to another; data were not compatible between machines, and when a computer was replaced, it was necessary to change the program as well.
- 3.
In multicore environments, programs run in parallel as lightweight processes called “threads.” Because different threads can access the same memory area at the same time, conflicts may occur; variables private to a routine but shared across calls are one source of such conflicts. In LAPACK 3.3, all routines were made thread safe by removing such private variables.
- 4.
It looks very similar to the textbook implementation, except that we operate on sub-matrices instead of individual numbers. This algorithm exploits the hierarchical structure of the memory cache; it is also well suited to multicore CPUs because each sub-matrix \(C_{pq}\) can be computed independently (a sketch of this blocked scheme is shown after these notes).
- 5.
The situation before 2010 was quite chaotic, because the source code was hidden by vendors.
- 6.
The calculation becomes difficult when the clock frequency changes dynamically, as in the case of Intel Turbo Boost.
- 7.
AVX refers to Intel Advanced Vector Extensions, an extension of the SIMD-type instructions succeeding SSE. Its registers are 256 bits wide, enough to hold four double-precision values, and it can issue an addition and a multiplication in the same clock. Since each instruction operates on four values, it is possible to perform eight double-precision operations in one clock (a small intrinsics sketch is also shown after these notes).
- 8.
However, it is better to use a Xeon, despite it being more expensive than a Core i7, because it has more memory bandwidth.
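As mentioned in note 4, blocked matrix multiplication applies the textbook triple loop to cache-sized sub-matrices. Below is a minimal C sketch under our own simplifying assumptions: row-major n × n matrices, with n a multiple of the illustrative block size NB = 64 (real libraries tune the block size to the actual cache hierarchy). This is a model of the idea, not how an optimized BLAS is actually written.

```c
/* blocked_gemm.c -- a sketch of blocked matrix multiplication, C += A*B,
 * where all matrices are n x n in row-major order and, for simplicity,
 * n is assumed to be a multiple of the block size NB.
 * Each block C_pq accumulates sums of sub-matrix products A_pr * B_rq,
 * exactly like the textbook formula, but with blocks instead of numbers.
 */
#define NB 64  /* illustrative block size; tuned to the cache in real libraries */

void blocked_gemm(int n, const double *A, const double *B, double *C) {
    for (int p = 0; p < n; p += NB)          /* block row of C    */
        for (int q = 0; q < n; q += NB)      /* block column of C */
            for (int r = 0; r < n; r += NB)  /* inner block index */
                /* C_pq += A_pr * B_rq on one cache-sized block */
                for (int i = p; i < p + NB; i++)
                    for (int j = q; j < q + NB; j++) {
                        double s = C[i * n + j];
                        for (int k = r; k < r + NB; k++)
                            s += A[i * n + k] * B[k * n + j];
                        C[i * n + j] = s;
                    }
}

/* Because distinct blocks C_pq are independent, the (p, q) loops can be
 * distributed across cores, e.g., with OpenMP. */
```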
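And as mentioned in note 7, one 256-bit AVX register holds four double-precision values, so one vector addition plus one vector multiplication amounts to eight floating-point operations. The following small sketch uses the standard AVX intrinsics from immintrin.h; the data values are arbitrary.

```c
/* avx_example.c -- four doubles per 256-bit AVX register.
 * Build: gcc -mavx avx_example.c -o avx_example
 */
#include <stdio.h>
#include <immintrin.h>

int main(void) {
    /* _mm256_set_pd lists elements from highest to lowest. */
    __m256d a = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);
    __m256d b = _mm256_set_pd(8.0, 7.0, 6.0, 5.0);

    __m256d sum  = _mm256_add_pd(a, b);  /* 4 additions                     */
    __m256d prod = _mm256_mul_pd(a, b);  /* 4 multiplications: 8 ops total  */

    double s[4], p[4];
    _mm256_storeu_pd(s, sum);
    _mm256_storeu_pd(p, prod);
    printf("sum  = %g %g %g %g\n", s[0], s[1], s[2], s[3]);
    printf("prod = %g %g %g %g\n", p[0], p[1], p[2], p[3]);
    return 0;
}
```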