
CU++: an object oriented framework for computational fluid dynamics applications using graphics processing units

Published in: The Journal of Supercomputing

Abstract

The application of graphics processing units (GPUs) to solve partial differential equations is gaining popularity with the advent of improved computer hardware. Various lower-level interfaces exist that give the user access to GPU-specific functions; one such interface is NVIDIA’s Compute Unified Device Architecture (CUDA). However, porting existing codes to the GPU requires the user to write kernels that execute on many cores following the Single Instruction Multiple Data (SIMD) model. In the present work, a higher-level framework, termed CU++, has been developed that uses object-oriented programming techniques available in C++ such as polymorphism, operator overloading, and template metaprogramming. Using this approach, CUDA kernels are generated automatically at compile time. In short, CU++ allows a developer with only C/C++ knowledge to write programs that execute on the GPU without any knowledge of CUDA-specific programming techniques. This approach is highly beneficial for Computational Fluid Dynamics (CFD) code development because it removes the need to hand-write hundreds of GPU kernels for various purposes. In its current form, CU++ provides a framework for parallel array arithmetic, simplified data structures that interface with the GPU, and smart array indexing. An implementation of heterogeneous parallelism, i.e., using multiple GPUs to simultaneously process a partitioned grid system with communication at the interfaces through the Message Passing Interface (MPI), has also been developed and tested.




Acknowledgements

We gratefully acknowledge support from the Office of Naval Research under ONR Grant N00014-09-1-1060.

Author information

Corresponding author: Dominic D. J. Chandar.


About this article

Cite this article

Chandar, D.D.J., Sitaraman, J. & Mavriplis, D. CU++: an object oriented framework for computational fluid dynamics applications using graphics processing units. J Supercomput 67, 47–68 (2014). https://doi.org/10.1007/s11227-013-0985-9
