A Study on Load Imbalance in Parallel Hypermatrix Multiplication Using OpenMP

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3911)

Abstract

In this paper we present our work on the parallelization, using OpenMP, of a matrix multiplication code based on the hypermatrix data structure. We added OpenMP directives to a few loops and experimented with several features of the OpenMP support in the Intel Fortran Compiler: scheduling algorithms, chunk sizes and nested parallelism. We found that none of these OpenMP features could remove the load imbalance introduced by the hypermatrix structure.
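The page carries no code listing, but since the abstract describes a concrete technique, here is a minimal Fortran sketch of that kind of experiment (not the authors' code): OpenMP directives on the block loops of a blocked, hypermatrix-style multiplication, with schedule(runtime) so the scheduling algorithm and chunk size can be varied through the standard OMP_SCHEDULE environment variable, plus an inner region that exercises nested parallelism. All names and sizes (nb, nblk) are assumptions, and the sketch stores every block densely, so it illustrates the directives rather than the imbalance itself.

    program hypermatrix_omp_sketch
      implicit none
      ! All names and sizes here are hypothetical, for illustration only.
      integer, parameter :: nb   = 32   ! block (submatrix) size
      integer, parameter :: nblk = 8    ! blocks per dimension
      real(8) :: a(nb,nb,nblk,nblk), b(nb,nb,nblk,nblk), c(nb,nb,nblk,nblk)
      integer :: i, j, k

      a = 1.0d0; b = 1.0d0; c = 0.0d0

      ! Outer block loop parallelized with a single OpenMP directive.
      ! schedule(runtime) lets the scheduling algorithm and chunk size be
      ! chosen without recompiling, e.g. OMP_SCHEDULE="dynamic,2".
      !$omp parallel do schedule(runtime) private(j, k)
      do i = 1, nblk
         ! With nested parallelism enabled (OMP_NESTED=TRUE), this inner
         ! region also becomes active.
         !$omp parallel do schedule(runtime) private(k)
         do j = 1, nblk
            do k = 1, nblk
               ! Dense product of two blocks. A real hypermatrix stores
               ! only the non-empty blocks, so the work per (i, j) varies;
               ! that variation is the load imbalance the paper studies.
               c(:,:,i,j) = c(:,:,i,j) + matmul(a(:,:,i,k), b(:,:,k,j))
            end do
         end do
         !$omp end parallel do
      end do
      !$omp end parallel do

      print *, 'checksum =', sum(c)
    end program hypermatrix_omp_sketch

For instance, running the program with OMP_SCHEDULE="static" and then OMP_SCHEDULE="dynamic,1" compares scheduling policies and chunk sizes without recompiling, while setting OMP_NESTED=TRUE additionally activates the inner parallel region.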

This work was supported by the Ministerio de Ciencia y Tecnología of Spain (TIN2004-07739-C02-01).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Herrero, J.R., Navarro, J.J. (2006). A Study on Load Imbalance in Parallel Hypermatrix Multiplication Using OpenMP. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2005. Lecture Notes in Computer Science, vol 3911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752578_16

  • DOI: https://doi.org/10.1007/11752578_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34141-3

  • Online ISBN: 978-3-540-34142-0
