Abstract
In a wide variety of applications from different scientific and engineering fields, the solution of complex and/or nonsymmetric linear systems of equations is required. To solve this kind of linear system, the BiConjugate Gradient method (BCG) is especially relevant. Nevertheless, BCG has an enormous computational cost. GPU computing is useful for accelerating this kind of algorithm, but suitable implementations must be developed to exploit the GPU architecture optimally. In this paper, we show how BCG can be effectively accelerated when all operations are computed on a GPU. To this end, BCG has been implemented with two alternative routines for the Sparse Matrix Vector product (SpMV): the one provided by the CUSPARSE library and the ELLR-T routine. Although our interest is focused on complex matrices, our implementation has been evaluated on a GPU for two sets of test matrices, complex and real, in single and double precision. Experimental results show that BCG based on the ELLR-T routine achieves the best performance, particularly for the set of complex test matrices. Consequently, this method can be a useful tool for efficiently solving the large (complex and/or nonsymmetric) linear systems of equations involved in a broad range of applications.
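To make the structure of the method concrete, the following is a minimal CPU sketch in pure Python of an unpreconditioned BCG iteration driven by two sparse matrix vector products per step. It is not the GPU implementation evaluated in the paper. The storage layout is an assumption modelled on the ELLPACK-R idea (padded value and column arrays plus a row-length array); the thread mapping that distinguishes ELLR-T is not modelled. All function names (`to_ellpack_r`, `spmv`, `spmv_h`, `bicg`) and the small test matrix are hypothetical, and computing the conjugate-transpose product by scattering over the same arrays is a choice made here for brevity, not something the abstract states.

```python
import math

def to_ellpack_r(dense):
    """Pack a dense square matrix (list of rows) into padded ELLPACK-R-style arrays."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    rl = [len(r) for r in rows]            # number of nonzeros in each row
    width = max(rl)                        # padded row width
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0j] * (width - len(r)) for r in rows]
    return vals, cols, rl

def spmv(ell, x):
    """y = A x, visiting only the first rl[i] entries of row i."""
    vals, cols, rl = ell
    return [sum(vals[i][k] * x[cols[i][k]] for k in range(rl[i]))
            for i in range(len(rl))]

def spmv_h(ell, x):
    """y = A^H x, computed by scattering conjugated entries (assumed approach)."""
    vals, cols, rl = ell
    y = [0j] * len(rl)
    for i in range(len(rl)):
        for k in range(rl[i]):
            y[cols[i][k]] += vals[i][k].conjugate() * x[i]
    return y

def dot(u, v):
    """Inner product with conjugation on the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def bicg(ell, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiConjugate Gradient for A x = b, starting from x = 0."""
    n = len(b)
    x = [0j] * n
    r = [complex(bi) for bi in b]          # residual b - A x with x = 0
    rt = list(r)                           # shadow residual
    p, pt = list(r), list(rt)              # search directions
    rho = dot(rt, r)
    bnorm = math.sqrt(dot(r, r).real)
    if bnorm == 0.0:
        return x
    for _ in range(max_iter):
        if math.sqrt(dot(r, r).real) <= tol * bnorm:
            break
        q = spmv(ell, p)                   # first SpMV: A p
        qt = spmv_h(ell, pt)               # second SpMV: A^H pt
        denom = dot(pt, q)
        if rho == 0 or denom == 0:
            raise ArithmeticError("BiCG breakdown")
        alpha = rho / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha.conjugate() * qi for ri, qi in zip(rt, qt)]
        rho_new = dot(rt, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta.conjugate() * pi for ri, pi in zip(rt, pt)]
        rho = rho_new
    return x

# Small complex, nonsymmetric example (illustrative data only).
A = [[4 + 1j, 1, 0],
     [2, 3 - 1j, 1j],
     [0, 1, 5 + 2j]]
x_true = [1 + 0j, 1j, -1 + 0j]
ell = to_ellpack_r(A)
b = spmv(ell, x_true)
x = bicg(ell, b)
```

Each iteration costs two sparse products plus a handful of vector updates and inner products, which is why the choice of SpMV routine dominates the running time of a GPU version.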


Acknowledgements
This work has been funded by grants from the Spanish Ministry of Science and Innovation (TIN2008-01117) and Junta de Andalucia (P08-TIC-3518, P10-TIC-6002), in part financed by the European Regional Development Fund (ERDF).
Cite this article
Ortega, G., Garzón, E.M., Vázquez, F. et al. The BiConjugate gradient method on GPUs. J Supercomput 64, 49–58 (2013). https://doi.org/10.1007/s11227-012-0761-2
DOI: https://doi.org/10.1007/s11227-012-0761-2