Abstract.
Solving systems of linear equations is central to scientific computation. In this paper, we focus on using Intel’s Pentium Streaming SIMD Extensions (SSE) for a parallel implementation of the LU-decomposition algorithm. Two implementations of LU-decomposition (non-SSE and SSE) are compared, as are two different variants of the algorithm for the SSE version. Our results demonstrate an average speedup of 2.25 times over the non-SSE version, which is higher than the 1.74-times speedup of Intel’s SSE implementation. The speedup comes from heavy reuse of loaded data, achieved by organizing the SSE instructions efficiently.
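To make the idea concrete, the following is a minimal sketch of how the row-update step of LU-decomposition can be mapped onto SSE. It is not the implementation evaluated in the paper and does not reproduce either of its variants. It assumes a row-major single-precision matrix, the kij loop order, no pivoting, and unaligned loads; the function name lu_sse is hypothetical. The broadcast multiplier is loaded once and reused across the whole row, which is one simple form of data reuse. A scalar fallback is used when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, ... */
#endif

/* In-place LU-decomposition without pivoting (kij order).
 * a is an n x n row-major matrix. On return, the strict lower triangle
 * holds L (unit diagonal implied) and the upper triangle holds U.
 * Assumption: every pivot is nonzero (e.g. a diagonally dominant matrix). */
void lu_sse(float *a, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        const float *row_k = a + k * n;
        float pivot = row_k[k];
        assert(pivot != 0.0f);  /* no pivoting in this sketch */

        for (size_t i = k + 1; i < n; ++i) {
            float *row_i = a + i * n;
            float l = row_i[k] / pivot;  /* multiplier, stored as L[i][k] */
            row_i[k] = l;

            size_t j = k + 1;
#if defined(__SSE__)
            /* Broadcast the multiplier once, reuse it for the whole row. */
            __m128 vl = _mm_set1_ps(l);
            for (; j + 4 <= n; j += 4) {
                __m128 u = _mm_loadu_ps(row_k + j);  /* 4 elements of row k */
                __m128 x = _mm_loadu_ps(row_i + j);  /* 4 elements of row i */
                x = _mm_sub_ps(x, _mm_mul_ps(vl, u));
                _mm_storeu_ps(row_i + j, x);
            }
#endif
            /* Scalar tail (or the full row when SSE is unavailable). */
            for (; j < n; ++j)
                row_i[j] -= l * row_k[j];
        }
    }
}
```

Calling lu_sse(a, n) on an n x n array overwrites it with the combined L and U factors. A production version would add partial pivoting and would likely align rows to 16 bytes so that aligned loads can be used.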
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takahashi, A., Soliman, M., Sedukhin, S. (2003). Parallel LU-decomposition on Pentium Streaming SIMD Extensions. In: Veidenbaum, A., Joe, K., Amano, H., Aiso, H. (eds) High Performance Computing. ISHPC 2003. Lecture Notes in Computer Science, vol 2858. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39707-6_37
DOI: https://doi.org/10.1007/978-3-540-39707-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20359-9
Online ISBN: 978-3-540-39707-6
eBook Packages: Springer Book Archive