Abstract
This paper compares various contemporary multicore-based microprocessor architectures from different vendors with different memory interconnects regarding performance, speedup, and parallel efficiency. Sparse matrix decomposition is used as a benchmark application. The example matrix used in the experiments comes from an electrical engineering application, where numerical simulation of physical processes plays an important role in the design of industrial products.
Within this context, thread-to-core pinning and cache optimization are two important aspects that are investigated in more detail.
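As a rough illustration of the quantities named above, the following sketch uses the standard textbook definitions of speedup and parallel efficiency and shows one way to pin the calling thread to a core. The function names and timing values are hypothetical and are not taken from the paper. The pinning helper assumes Linux, where Python exposes `os.sched_setaffinity`. The paper's own experiments may well use a different mechanism.

```python
import os


def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p): single-thread runtime over p-thread runtime."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_threads):
    """Parallel efficiency E(p) = S(p) / p; 1.0 means ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_threads


def pin_to_core(core_id):
    """Try to restrict the caller to a single core.

    Returns True on success. Returns False when the platform lacks
    os.sched_setaffinity (non-Linux) or when core_id is not among the
    cores the caller is currently allowed to run on.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core_id not in os.sched_getaffinity(0):
        return False
    os.sched_setaffinity(0, {core_id})  # pid 0 refers to the caller
    return True


if __name__ == "__main__":
    # Made-up timings for illustration only: 12.0 s serial, 3.0 s on 8 threads.
    print(speedup(12.0, 3.0))
    print(parallel_efficiency(12.0, 3.0, 8))
```

With these made-up timings the speedup is 4 on 8 threads, so the efficiency is 0.5. Comparing that figure with and without pinning is one simple way to see how thread placement interacts with the memory interconnect.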
Cite this article
Trinitis, C., Küstner, T., Weidendorfer, J. et al. Sparse matrix operations on several multi-core architectures. J Supercomput 57, 132–140 (2011). https://doi.org/10.1007/s11227-010-0428-9