Abstract
The performance gains of recent generations of graphics processors (GPUs) have made this class of hardware a remarkably successful coprocessing platform for certain types of operations. In this paper we evaluate the performance of linear algebra and image processing routines on both classical and unified GPU architectures, as well as on traditional processors (CPUs). From this study, we gain insights into the properties that make an algorithm likely to deliver high performance on a GPU.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Igual, F.D., Mayo, R., Quintana-Ortí, E.S. (2008). Attaining High Performance in General-Purpose Computations on Current Graphics Processors. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_37
DOI: https://doi.org/10.1007/978-3-540-92859-1_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92858-4
Online ISBN: 978-3-540-92859-1
eBook Packages: Computer Science (R0)