Parallel Processing of Matrix Multiplication in a CPU and GPU Heterogeneous Environment

Ohshima, Satoshi; Kise, Kenji; Katagiri, Takahiro; Yuba, Toshitsugu

doi:10.1007/978-3-540-71351-7_24

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4395)

Abstract

Using GPUs for numerical computation is becoming an attractive alternative in research. In this paper, we propose a new parallel processing environment for matrix multiplication that uses both the CPU and the GPU. With our method, the execution time of matrix multiplication can be reduced to 40.1% of that of the faster of the CPU-only and GPU-only cases. Our method performs well when the matrices are large.
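
The abstract does not say how the work is divided between the two processors. As a rough illustration only, the sketch below assumes a simple row-wise split of the left-hand matrix, with two threads standing in for the CPU and the GPU. The function names and the `gpu_share` parameter are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B (plain loops)."""
    cols = list(zip(*b))  # columns of B
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def split_matmul(a, b, gpu_share=0.5):
    """Split the rows of A between two workers and join the partial products.

    gpu_share is a hypothetical tuning parameter: the fraction of rows
    handed to the worker that stands in for the GPU.
    """
    cut = round(len(a) * gpu_share)
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_part = pool.submit(matmul_rows, a[:cut], b)  # stand-in for the GPU
        cpu_part = pool.submit(matmul_rows, a[cut:], b)  # stand-in for the CPU
        return gpu_part.result() + cpu_part.result()


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
result = split_matmul(a, b, gpu_share=2 / 3)
```

In a scheme like this, the split fraction would presumably be chosen so that both workers finish at about the same time; how the paper actually balances the load is not stated in the abstract.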

Author information

Authors and Affiliations

Graduate School of Information Systems, The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu-shi, Tokyo, Japan
Satoshi Ohshima, Kenji Kise, Takahiro Katagiri & Toshitsugu Yuba

Editor information

Michel Daydé, José M. L. M. Palma, Álvaro L. G. A. Coutinho, Esther Pacitti, João Correia Lopes

About this paper

Cite this paper

Ohshima, S., Kise, K., Katagiri, T., Yuba, T. (2007). Parallel Processing of Matrix Multiplication in a CPU and GPU Heterogeneous Environment. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_24

DOI: https://doi.org/10.1007/978-3-540-71351-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71350-0
Online ISBN: 978-3-540-71351-7
eBook Packages: Computer Science, Computer Science (R0)
