Cache Efficiency and Scalability on Multi-core Architectures

Müller, Thomas; Trinitis, Carsten; Smajic, Jasmin

doi:10.1007/978-3-642-23178-0_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6873)

Included in the following conference series:

International Conference on Parallel Computing Technologies

Abstract

Two electrical engineering applications from industry partners, both dealing with sparse matrices, were analyzed with regard to cache efficiency and scalability on modern multi-core systems. Two contemporary multi-core architectures were investigated: Intel’s Westmere and AMD’s Magny-Cours. This paper continues the investigations presented in [14] and [15].

In addition, the SuiteSparseQR library for efficiently computing QR factorizations of sparse matrices was evaluated regarding scalability and cache efficiency.

References

1. OpenMP: The OpenMP API specification for parallel programming, http://www.openmp.org
2. Perf Wiki, https://perf.wiki.kernel.org
3. SuiteSparse: a Suite of Sparse matrix packages, http://www.cise.ufl.edu/research/sparse/SuiteSparse/
4. SuiteSparseQR: multithreaded multifrontal sparse QR factorization, http://www.cise.ufl.edu/research/sparse/SPQR/
5. AMD: AMD Core Math Library, http://www.amd.com/acml/
6. Amdahl, G.: Validity of the single processor approach to achieving large-scale computing capabilities. In: AFIPS Conference Proceedings, vol. 30, pp. 483–485 (1967), http://www-inst.eecs.berkeley.edu/~n252/paper/Amdahl.pdf
7. Amestoy, P.R., Duff, I.S., Puglisi, C.: Multifrontal QR factorization in a multiprocessor environment. Numerical Linear Algebra with Applications 3(4), 275–300 (1996), http://dx.doi.org/10.1002/SICI1099-150199607/083:4275::AID-NLA833.0.CO2-7
8. Intel: Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2, http://www.intel.com/Assets/PDF/manual/253669.pdf
9. Intel: Math Kernel Library, http://software.intel.com/en-us/articles/intel-mkl/
10. Klug, T., Ott, M., Weidendorfer, J., Trinitis, C.: autopin - automated optimization of thread-to-core pinning on multicore systems. In: Stenström, P. (ed.) Transactions on High-Performance Embedded Architectures and Compilers III. LNCS, vol. 6590, pp. 219–235. Springer, Heidelberg (2011), http://dx.doi.org/10.1007/978-3-642-19448-1_12
11. Liu, J.W.H.: The multifrontal method for sparse matrix solution: Theory and practice. SIAM Review 34(1), 82–109 (1992), http://link.aip.org/link/?SIR/34/82/1
12. Matstoms, P.: Sparse linear least squares problems in optimization. Computational Optimization and Applications 7, 89–110 (1997), http://dx.doi.org/10.1023/A:1008680131271
13. Tinney, W., Brandwajn, V., Chan, S.: Sparse vector methods. IEEE Transactions on Power Apparatus and Systems PAS-104(2), 295–301 (1985)
14. Trinitis, C., Küstner, T., Weidendorfer, J., Smajic, J.: Sparse matrix operations on multi-core architectures. In: Malyshkin, V. (ed.) PaCT 2009. LNCS, vol. 5698, pp. 41–48. Springer, Heidelberg (2009), http://dx.doi.org/10.1007/978-3-642-03275-2_5
15. Trinitis, C., Küstner, T., Weidendorfer, J., Smajic, J.: Sparse matrix operations on several multi-core architectures. The Journal of Supercomputing, 1–9 (2010), http://dx.doi.org/10.1007/s11227-010-0428-9

Author information

Authors and Affiliations

Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München, Germany
Thomas Müller & Carsten Trinitis
ABB Corporate Research Switzerland, Segelhof 1, Baden, Switzerland
Jasmin Smajic

Editor information

Editors and Affiliations

Institute of Computational Mathematics and Mathematical Geophysics, Supercomputer Software Department, Russian Academy of Sciences Pr. Lavrentieva, ICM&MG RAS, 630090, Novosibirsk, Russia
Victor Malyshkin

About this paper

Cite this paper

Müller, T., Trinitis, C., Smajic, J. (2011). Cache Efficiency and Scalability on Multi-core Architectures. In: Malyshkin, V. (ed.) Parallel Computing Technologies. PaCT 2011. Lecture Notes in Computer Science, vol 6873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23178-0_8

DOI: https://doi.org/10.1007/978-3-642-23178-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23177-3
Online ISBN: 978-3-642-23178-0
eBook Packages: Computer Science, Computer Science (R0)
