A GPU Implementation for Solving the Convection Diffusion Equation Using the Local Modified SOR Method

Cotronis, Yiannis; Konstantinidis, Elias; Missirlis, Nikolaos M.

doi:10.1007/978-3-319-06548-9_10

Abstract

In this chapter we describe a parallel CUDA implementation of the successive over-relaxation (SOR) method for the numerical solution of the convection diffusion equation on GPUs. We demonstrate two generally applicable programming techniques: memory reordering, which enables coalesced memory accesses, and recomputation of data that would otherwise be stored and read back from memory, which alleviates the memory bandwidth bottleneck and increases the feasible problem size. We focus on the local relaxation version of SOR, in particular the local Modified SOR (LMSOR) method, which has a better rate of convergence than SOR. We present our CUDA implementations of LMSOR together with the optimizations applied to exploit the computational capabilities of modern GPUs. In addition, we report performance results for GPUs based on the Fermi and Kepler architectures and for a contemporary quad-core CPU. The CPU implementation is parallelized with OpenMP and manually vectorized with AVX (Advanced Vector Extensions). The recomputation results are promising, and we expect the technique to become more significant in the near future.
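
To make the ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the chapter's CUDA code. It assumes a red/black ordering of the grid points (the abstract does not say which ordering the authors use), a Laplace model problem with the 5-point stencil instead of the convection diffusion equation, and a single global relaxation factor `omega` rather than the per-point parameters of LMSOR. The names `make_grid`, `red_black_sor` and `split_colors` are hypothetical. `split_colors` only hints at memory reordering: each color is copied into its own contiguous list, so that neighboring GPU threads would read neighboring addresses.

```python
def make_grid(n):
    """Return an n x n grid with boundary values u = i + j and a zero interior.

    The boundary data is an assumption chosen for illustration: u = i + j is
    harmonic, so the 5-point Laplace stencil reproduces it exactly.
    """
    return [[float(i + j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
             for j in range(n)] for i in range(n)]


def red_black_sor(u, omega, sweeps):
    """Run `sweeps` red/black SOR iterations in place on the square grid u.

    Points with (i + j) even are "red", the others "black". A point's four
    neighbors all have the opposite color, so every point of one color can be
    updated independently of the others of that color.
    """
    n = len(u)
    for _ in range(sweeps):
        for color in (0, 1):  # 0 = red half-sweep, 1 = black half-sweep
            for i in range(1, n - 1):
                # First interior column j >= 1 with (i + j) % 2 == color.
                start = 1 + (i + 1 + color) % 2
                for j in range(start, n - 1, 2):
                    gauss_seidel = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                           + u[i][j - 1] + u[i][j + 1])
                    # SOR step: blend old value and Gauss-Seidel value.
                    u[i][j] = (1.0 - omega) * u[i][j] + omega * gauss_seidel
    return u


def split_colors(u):
    """Copy the grid into two contiguous row-major lists, one per color."""
    red, black = [], []
    for i, row in enumerate(u):
        for j, value in enumerate(row):
            (red if (i + j) % 2 == 0 else black).append(value)
    return red, black


if __name__ == "__main__":
    n = 8
    u = red_black_sor(make_grid(n), omega=1.5, sweeps=200)
    worst = max(abs(u[i][j] - (i + j)) for i in range(n) for j in range(n))
    print("max error against u = i + j:", worst)
```

In a GPU setting, each half-sweep over one color would typically map to one kernel launch, with one thread per point of that color. Storing the two colors separately, as `split_colors` suggests, is one possible way to keep those threads' reads contiguous.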

Acknowledgements

We would like to acknowledge the kind permission of the Innovative Computing Laboratory at the University of Tennessee to use their NVidia Tesla S2050 installation for the purpose of this work.

Author information

Authors and Affiliations

Department of Informatics and Telecommunications, University of Athens, Panepistimiopolis, 15784, Athens, Greece
Yiannis Cotronis, Elias Konstantinidis & Nikolaos M. Missirlis

Corresponding author

Correspondence to Elias Konstantinidis.

Editor information

Editors and Affiliations

National Center for Supercomputing Applications, University of Illinois, Urbana, Illinois, USA
Volodymyr Kindratenko

About this chapter

Cite this chapter

Cotronis, Y., Konstantinidis, E., Missirlis, N.M. (2014). A GPU Implementation for Solving the Convection Diffusion Equation Using the Local Modified SOR Method. In: Kindratenko, V. (eds) Numerical Computations with GPUs. Springer, Cham. https://doi.org/10.1007/978-3-319-06548-9_10

DOI: https://doi.org/10.1007/978-3-319-06548-9_10
Published: 09 June 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06547-2
Online ISBN: 978-3-319-06548-9
eBook Packages: Computer Science, Computer Science (R0)