Rectangular Full Packed Format for LAPACK Algorithms Timings on Several Computers

  • Conference paper
Applied Parallel Computing. State of the Art in Scientific Computing (PARA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4699)

Abstract

We describe a new data format for storing triangular and symmetric matrices called RFP (Rectangular Full Packed). The standard two-dimensional arrays of Fortran and C (also known as full format) that are used to store triangular and symmetric matrices waste nearly half the storage space but provide high performance via the use of level 3 BLAS. Standard packed format arrays fully utilize storage (array space) but provide low performance, as there are no level 3 packed BLAS. RFP format combines the good features of packed and full storage: because an RFP array is a full format array, it obtains high performance from level 3 (L3) BLAS, and it requires exactly the same minimal storage as packed format. Each full and/or packed symmetric/triangular routine becomes a single new RFP routine. We present LAPACK routines for Cholesky factorization, inverse and solution computation in RFP format to illustrate this new work and to describe its performance on the IBM, Itanium, NEC, and SUN platforms. Performance of RFP versus LAPACK full routines for both serial and SMP parallel processing is about the same while using half the storage. Compared with LAPACK packed routines, RFP routines are roughly 1 to 33 times faster in serial processing and roughly 1 to 100 times faster in SMP parallel processing. Existing LAPACK routines and vendor LAPACK routines were used in the serial and the SMP parallel study, respectively. In both studies vendor L3 BLAS were used.
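The storage claim in the abstract can be illustrated with a small sketch. The abstract does not spell out the RFP layout, so the index mapping below is an assumption: it follows the arrangement commonly documented for the lower-triangular, non-transposed case with even order n, where the triangle is folded into a full (n+1) by n/2 rectangle. The function names `storage_counts`, `rfp_index`, and `to_rfp` are hypothetical and exist only for this illustration; they are not LAPACK routines.

```python
def storage_counts(n):
    """Return (full, packed) element counts for an order-n triangular matrix."""
    full = n * n                 # two-dimensional (full format) array
    packed = n * (n + 1) // 2    # packed format; RFP needs the same count
    return full, packed


def rfp_index(i, j, n):
    """Map lower-triangle entry (i, j), i >= j, to (row, col) in the rectangle.

    Assumed layout (even n, lower triangle, non-transposed): the rectangle
    has n + 1 rows and n // 2 columns.
    """
    if n % 2 != 0 or not (0 <= j <= i < n):
        raise ValueError("need even n and 0 <= j <= i < n")
    k = n // 2
    if j < k:
        # The first k columns of the triangle are shifted down by one row.
        return i + 1, j
    # The trailing k-by-k triangle is stored transposed in the space above.
    return j - k, i - k


def to_rfp(a):
    """Copy the lower triangle of the square list-of-lists a into the rectangle."""
    n = len(a)
    k = n // 2
    arf = [[None] * k for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1):
            r, c = rfp_index(i, j, n)
            arf[r][c] = a[i][j]
    return arf


if __name__ == "__main__":
    n = 6
    # Label each entry by its indices: a[i][j] = 10 * i + j.
    a = [[10 * i + j for j in range(n)] for i in range(n)]
    for row in to_rfp(a):
        print(row)
    print(storage_counts(n))
```

Under this assumed layout, every cell of the rectangle holds exactly one entry of the triangle, so the rectangle is an ordinary full format array (usable by level 3 BLAS on its sub-blocks) while occupying n(n+1)/2 elements, the same as packed format.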

Editor information

Bo Kågström, Erik Elmroth, Jack Dongarra, Jerzy Waśniewski

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gustavson, F.G., Waśniewski, J. (2007). Rectangular Full Packed Format for LAPACK Algorithms Timings on Several Computers. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2006. Lecture Notes in Computer Science, vol 4699. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75755-9_69

  • DOI: https://doi.org/10.1007/978-3-540-75755-9_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75754-2

  • Online ISBN: 978-3-540-75755-9

  • eBook Packages: Computer Science, Computer Science (R0)
