Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix Scheme

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3911)

Abstract

We present how we adapt data and computations to the underlying memory hierarchy by means of a hierarchical data structure known as a hypermatrix. Applying orthogonal block forms produced the best performance on the platforms used.
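
As a rough illustration of the kind of hierarchical structure the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes a two-level layout in which an upper level holds references to dense data blocks and an all-zero block is represented by None; the names to_hypermatrix and matvec, and the block size parameter bs, are hypothetical and introduced only for this example.

```python
from typing import List, Optional

# Assumed layout for this sketch (not taken from the paper):
# the bottom level holds dense bs x bs data blocks, the upper level
# holds references to them, and None stands for an all-zero block.
Block = List[List[float]]
HyperMatrix = List[List[Optional[Block]]]


def to_hypermatrix(a: List[List[float]], bs: int) -> HyperMatrix:
    """Split a square matrix into bs x bs blocks, dropping all-zero blocks."""
    n = len(a)
    assert n % bs == 0, "sketch assumes the order is a multiple of bs"
    nb = n // bs
    hm: HyperMatrix = []
    for bi in range(nb):
        row: List[Optional[Block]] = []
        for bj in range(nb):
            block = [a[bi * bs + i][bj * bs:(bj + 1) * bs] for i in range(bs)]
            nonzero = any(any(x != 0 for x in r) for r in block)
            row.append(block if nonzero else None)
        hm.append(row)
    return hm


def matvec(hm: HyperMatrix, x: List[float], bs: int) -> List[float]:
    """Compute y = A x, visiting only the blocks that are actually stored."""
    y = [0.0] * (len(hm) * bs)
    for bi, row in enumerate(hm):
        for bj, block in enumerate(row):
            if block is None:
                continue  # zero block: no data is touched and no work is done
            for i in range(bs):
                acc = 0.0
                for j in range(bs):
                    acc += block[i][j] * x[bj * bs + j]
                y[bi * bs + i] += acc
    return y
```

In this sketch the block size bs is the tuning knob: choosing it so that a data block fits in a given cache level is one plausible way such a structure could be matched to the memory hierarchy, although the abstract does not state how the authors choose it.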

This work was supported by the Ministerio de Ciencia y Tecnología of Spain (TIN2004-07739-C02-01).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Herrero, J.R., Navarro, J.J. (2006). Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix Scheme. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2005. Lecture Notes in Computer Science, vol 3911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752578_128

  • DOI: https://doi.org/10.1007/11752578_128

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34141-3

  • Online ISBN: 978-3-540-34142-0

  • eBook Packages: Computer Science, Computer Science (R0)
