Solving finite difference linear systems on GPUs: CUDA based Parallel Explicit Preconditioned Biconjugate Conjugate Gradient type Methods

Gravvanis, G. A.; Filelis-Papadopoulos, C. K.; Giannoutakis, K. M.

doi:10.1007/s11227-011-0619-z

Published: 31 May 2011

The Journal of Supercomputing, Volume 61, pages 590–604 (2012)

G. A. Gravvanis¹,
C. K. Filelis-Papadopoulos¹ &
K. M. Giannoutakis²

Abstract

Over the last few decades, explicit approximate inverse preconditioning methods have been used to solve sparse linear systems efficiently on multiprocessor systems. Their effectiveness relies on preconditioners that closely approximate the inverse of the coefficient matrix and are fast to compute in parallel. A new parallel computational technique is proposed for parallelizing the explicit preconditioned conjugate gradient type method on a Graphics Processing Unit (GPU). The proposed parallel methods are implemented with the Compute Unified Device Architecture (CUDA) developed by NVIDIA. The operations between vectors and matrices in the explicit preconditioned biconjugate conjugate gradient type schemes are inherently parallel: the matrix–vector and vector–vector products exhibit significant loop-level parallelism, which can yield high performance gains on GPU systems designed for such computations. Finally, numerical results are presented for the performance of the explicit preconditioned biconjugate conjugate gradient type method in solving characteristic two-dimensional boundary value problems, discretized by the finite difference method, on a massively multiprocessor GPU. CUDA implementation issues of the proposed method are also discussed.
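To make the loop-level parallelism concrete, the following is a minimal sketch in plain Python, not the authors' CUDA implementation. It assumes a BiCGSTAB-style iteration as one representative of the biconjugate conjugate gradient type family, the 5-point finite difference Laplacian as the coefficient matrix, and a one-term Neumann series `M = (2I - A/4)/4` as a hypothetical stand-in for the explicit approximate inverse; the function names `apply_A`, `apply_M`, `dot`, and `bicgstab` are illustrative only. The point is that the preconditioner is applied by multiplication rather than by a triangular solve, so every step is a matrix–vector or vector–vector product whose loop iterations are independent (or form a reduction).

```python
import math

def apply_A(u, n):
    """y = A u for the 5-point Laplacian on an n x n interior grid (Dirichlet).
    Each output entry is independent: one GPU thread per grid point."""
    y = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            s = 4.0 * u[k]
            if i > 0:
                s -= u[k - n]
            if i < n - 1:
                s -= u[k + n]
            if j > 0:
                s -= u[k - 1]
            if j < n - 1:
                s -= u[k + 1]
            y[k] = s
    return y

def apply_M(r, n):
    """z = M r with M = (2I - A/4)/4 (assumed stand-in for an approximate
    inverse). Applied explicitly, i.e. by a matrix-vector product."""
    Ar = apply_A(r, n)
    return [(2.0 * ri - ai / 4.0) / 4.0 for ri, ai in zip(r, Ar)]

def dot(a, b):
    """Inner product: a parallel reduction on a GPU."""
    return sum(x * y for x, y in zip(a, b))

def bicgstab(b, n, tol=1e-10, max_iter=200):
    """Right-preconditioned BiCGSTAB for A x = b, starting from x = 0."""
    N = n * n
    x = [0.0] * N
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    r = list(b)            # residual b - A x for x = 0
    r0 = list(r)           # fixed shadow residual
    p = [0.0] * N
    v = [0.0] * N
    rho = alpha = omega = 1.0
    for it in range(1, max_iter + 1):
        rho_new = dot(r0, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = apply_M(p, n)
        v = apply_A(y, n)
        alpha = rho_new / dot(r0, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) <= tol * b_norm:
            x = [xi + alpha * yi for xi, yi in zip(x, y)]
            return x, it
        z = apply_M(s, n)
        t = apply_A(z, n)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
    return x, max_iter
```

As a usage example, building `b = apply_A([1.0] * (n * n), n)` for a small grid such as `n = 8` and calling `bicgstab(b, n)` should recover a solution close to the all-ones vector. In a GPU version, each list comprehension and each call to `apply_A` or `dot` would typically become a kernel launch, which is why the cost of the method is dominated by a small set of data-parallel primitives.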

Author information

Authors and Affiliations

Department of Electrical & Computer Engineering, School of Engineering, Democritus University of Thrace, University Campus, Kimmeria, 67100, Xanthi, Greece
G. A. Gravvanis & C. K. Filelis-Papadopoulos
Centre for Research and Technology Hellas, Informatics and Telematics Institute, 57001, Thermi, Greece
K. M. Giannoutakis

Corresponding author

Correspondence to G. A. Gravvanis.

About this article

Cite this article

Gravvanis, G.A., Filelis-Papadopoulos, C.K. & Giannoutakis, K.M. Solving finite difference linear systems on GPUs: CUDA based Parallel Explicit Preconditioned Biconjugate Conjugate Gradient type Methods. J Supercomput 61, 590–604 (2012). https://doi.org/10.1007/s11227-011-0619-z

Published: 31 May 2011
Issue Date: September 2012
DOI: https://doi.org/10.1007/s11227-011-0619-z
