A GPU Implementation for Solving the Convection Diffusion Equation Using the Local Modified SOR Method

Numerical Computations with GPUs

Abstract

In this chapter we describe a parallel CUDA implementation of the SOR method for the numerical solution of the convection-diffusion equation on GPUs. We demonstrate two generally applicable programming techniques: memory reordering, which enables coalesced memory accesses, and recomputation of data that would otherwise be stored, which alleviates the memory bandwidth bottleneck and increases the feasible problem size. We focus on the local relaxation version of SOR; in particular, we apply the local Modified SOR method (LMSOR), which has a better rate of convergence than SOR. We present our CUDA implementations of the LMSOR method, with optimizations aimed at exploiting the computational capabilities of modern GPUs. In addition, we report performance results for GPUs based on the Fermi and Kepler architectures and for a contemporary quad-core CPU. The CPU implementation is parallelized with OpenMP and uses manual AVX (Advanced Vector Extensions) vectorization. The results for recomputation look promising, and we expect the technique to become more significant in the near future.
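
To make the recomputation idea concrete, the following is a minimal CPU sketch in plain Python, not the chapter's CUDA code. It assumes the model problem u_xx + u_yy - f(x,y) u_x - g(x,y) u_y = 0 on the unit square, discretized with central differences on a uniform grid, and it uses red/black ordering with a single global relaxation factor `omega` (ordinary SOR) because the local LMSOR parameters are not given in this abstract. The names `stencil`, `red_black_sor` and the `store` flag are hypothetical and chosen only for illustration. With `store=False` the four neighbour weights are recomputed at every update instead of being read from a table, trading arithmetic for memory traffic.

```python
def stencil(i, j, h, f, g):
    """Recompute the four neighbour weights of grid point (i, j) on the fly."""
    x, y = i * h, j * h
    fx = 0.5 * h * f(x, y)
    gy = 0.5 * h * g(x, y)
    # Weights of the west, east, south and north neighbours (central differences).
    return 0.25 * (1 + fx), 0.25 * (1 - fx), 0.25 * (1 + gy), 0.25 * (1 - gy)


def red_black_sor(n, f, g, boundary, omega=1.0, sweeps=300, store=False):
    """Red/black SOR on an n x n interior grid with a constant boundary value."""
    h = 1.0 / (n + 1)
    edge = (0, n + 1)
    u = [[boundary if i in edge or j in edge else 0.0 for j in range(n + 2)]
         for i in range(n + 2)]
    # Optional precomputed table: the "stored data" alternative to recomputation.
    table = None
    if store:
        table = {(i, j): stencil(i, j, h, f, g)
                 for i in range(1, n + 1) for j in range(1, n + 1)}
    for _ in range(sweeps):
        for colour in (0, 1):  # all points of one colour can be updated in parallel
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    if (i + j) % 2 != colour:
                        continue
                    w, e, s, nth = table[(i, j)] if store else stencil(i, j, h, f, g)
                    gs = (w * u[i - 1][j] + e * u[i + 1][j]
                          + s * u[i][j - 1] + nth * u[i][j + 1])
                    u[i][j] += omega * (gs - u[i][j])  # SOR relaxation step
    return u


if __name__ == "__main__":
    u = red_black_sor(8, lambda x, y: 10 * x, lambda x, y: 10 * y, boundary=1.0)
    print(max(abs(u[i][j] - 1.0) for i in range(1, 9) for j in range(1, 9)))
```

Because the four weights sum to one, a constant boundary value yields the same constant as the exact discrete solution, which gives a simple correctness check. On a GPU the stored table would occupy four extra arrays of global memory, which is the cost that recomputation avoids.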

Acknowledgements

We would like to acknowledge the kind permission of the Innovative Computing Laboratory at the University of Tennessee to use their NVidia Tesla S2050 installation for the purpose of this work.

Corresponding author

Correspondence to Elias Konstantinidis.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Cotronis, Y., Konstantinidis, E., Missirlis, N.M. (2014). A GPU Implementation for Solving the Convection Diffusion Equation Using the Local Modified SOR Method. In: Kindratenko, V. (eds) Numerical Computations with GPUs. Springer, Cham. https://doi.org/10.1007/978-3-319-06548-9_10

  • DOI: https://doi.org/10.1007/978-3-319-06548-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06547-2

  • Online ISBN: 978-3-319-06548-9

  • eBook Packages: Computer Science, Computer Science (R0)
