GPU-based flow simulation with detailed chemical kinetics
Introduction
Detailed study of the complex physical phenomena associated with high-speed fluid flow requires a deep understanding of the fundamental physics, which involves many highly non-linear processes evolving on different spatial and temporal scales. The main challenge in solving these problems lies in the coupling between these processes and its impact on the flow solution. For example, in combustion studies, one has to pay close attention to the coupling between the fluid transport and chemical kinetics in order to characterize the combustion process. Computational Fluid Dynamics (CFD) techniques can be used to obtain a detailed flow solution for use in practical applications. Although the mathematical formulation of the physics can be addressed in great detail, numerical simulation of high-speed fluid flow in a non-equilibrium environment is often limited by the computational power required to solve the governing equations. Modern CFD codes are designed to take advantage of high performance computing (HPC) platforms to reduce run time. Unfortunately, access to traditional HPC resources is limited by their cost and maintenance requirements. These limitations have accentuated the need for a compact and low-cost HPC solution on which numerical solvers can be effectively implemented.
During the last eight years, the Graphics Processing Unit (GPU) has been introduced as a promising alternative to high-cost HPC platforms. Within this period, the GPU has evolved into a highly capable and low-cost computing solution for scientific research. Fig. 1 illustrates the advantage of the GPU over the traditional Central Processing Unit (CPU) in terms of floating point performance. This is due to the fact that the GPU is designed for the highly parallel process of graphics rendering. Starting in 2008, the GPU began to support double precision calculation, which is necessary for scientific computing. The newest generation of NVIDIA GPUs, called “Fermi”, has been designed to improve double precision performance over the previous generations of NVIDIA GPUs. CUDA [1], which is currently the most popular programming environment for general purpose GPU computing, has undergone several development phases and reached a level of maturity that is essential for the design of numerical solvers. Other alternatives such as OpenCL, DirectCompute, and Stream have also begun to mature, which encourages new developments in scientific computing on the GPU. Several attempts have been made at writing scientific codes on the GPU, either by using CUDA directly or via a wrapper that calls the CUDA kernel functions, and promising results were obtained in terms of both performance and flexibility. Previous implementations of CFD codes on the GPU focused purely on solving the fluid dynamics using either finite volume [2], [3] or finite element methods [4]. Recently, there have been a number of GPU-based implementations of multiphysics simulation. Of particular note are the extension to magnetohydrodynamics simulation by Wong et al. [5] and the automated preprocessor tool to model finite rate chemical kinetics by Linford et al. [6], [7].
In all cases, the reported speed-ups are promising and clearly demonstrate that the GPU is suitable for massively parallel scientific computing.
In this paper, we describe in detail the implementation of a numerical solver coupling the fluid dynamics with detailed chemical kinetics. Unlike previous attempts at adapting numerical solvers to the GPU, we have placed our emphasis on the kinetics solver rather than the fluid dynamics. This is because, for the simulation of high-speed fluid flow, the computation is dominated by solving the kinetics. While the current implementation is only for chemical kinetics, it can readily be extended to more general kinetics (e.g., collisional-radiative kinetics for a plasma), since all the elementary processes in a plasma (excitation/de-excitation, ionization/recombination, etc.) can be represented as chemical reactions with rates computed a priori and tabulated as functions of temperature.
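The idea of precomputing reaction rates and tabulating them against temperature can be illustrated with a minimal Python sketch. It assumes a modified Arrhenius form for the rate coefficient, a uniform temperature grid, and linear interpolation; the function names, the grid, and the coefficients are hypothetical and are not taken from the solver described in this paper.

```python
import math

def arrhenius(T, A, n, Ta):
    """Modified Arrhenius rate coefficient k(T) = A * T**n * exp(-Ta / T)."""
    return A * T**n * math.exp(-Ta / T)

def build_rate_table(rate_fn, T_min, T_max, n_points):
    """Precompute k(T) once on a uniform temperature grid (assumed layout)."""
    dT = (T_max - T_min) / (n_points - 1)
    values = [rate_fn(T_min + i * dT) for i in range(n_points)]
    return {"T_min": T_min, "dT": dT, "values": values}

def lookup_rate(table, T):
    """Linearly interpolate the tabulated rate; T is clamped to the table range."""
    n = len(table["values"])
    x = (T - table["T_min"]) / table["dT"]   # fractional grid position
    x = min(max(x, 0.0), n - 1.0)            # clamp out-of-range temperatures
    i = min(int(x), n - 2)                   # left node of the interval
    w = x - i                                # interpolation weight in [0, 1]
    return (1.0 - w) * table["values"][i] + w * table["values"][i + 1]

# Hypothetical coefficients, chosen only to exercise the lookup.
A, n_exp, Ta = 1.0e10, 0.5, 8000.0
table = build_rate_table(lambda T: arrhenius(T, A, n_exp, Ta), 300.0, 6000.0, 2001)

k_interp = lookup_rate(table, 2345.6)
k_exact = arrhenius(2345.6, A, n_exp, Ta)
rel_err = abs(k_interp - k_exact) / k_exact
```

Any elementary process whose rate depends only on temperature can be stored in the same kind of table, so the lookup code does not need to know whether the entry describes a chemical reaction or, for example, an excitation process.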
The rest of the paper is organized as follows. The governing equations and numerical formulation for both the fluid dynamics and chemical kinetics are described in Sections 2 and 3. Section 4 gives some background on GPU computing. The code implementation is detailed in Section 5, highlighting several optimization techniques for maximizing the performance of the solver. The results of several benchmark test cases for both non-reactive and reactive flow fields are presented in Section 6, together with the performance results of the solver. Section 7 gives the conclusions and points out possible future work.
Governing equations
The flow is modeled as a mixture of gas species, neglecting viscous effects. The chemical reactions taking place between the gas components are modeled in great detail. The governing equations are the Euler equations for a reactive gas mixture, written in terms of the vectors of conservative variables and inviscid fluxes. We assume that there is no species diffusion and that the gas is in thermal equilibrium (i.e., all species have the same velocity and all the
Fluid dynamics
A dimensional splitting technique [8] is used to solve the convective part of the governing equations. To achieve high-order accuracy in both space and time, we employ a fifth-order Monotonicity-Preserving scheme (MP5) [9] for the reconstruction and a third-order Runge–Kutta (RK3) scheme for time integration. For the MP5 scheme, the reconstructed values of the left and right states at an interface are given as (see Fig. 2):
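As an illustration only, and not the GPU implementation described in this paper, the combination of MP5 reconstruction and RK3 time stepping can be sketched in Python for one-dimensional linear advection on a periodic grid. The sketch assumes the commonly used form of the limiter from Ref. [9] with the parameter alpha = 4, the Shu–Osher strong-stability-preserving variant of RK3 (the text above does not say which variant is used), and a positive advection speed so that only the left interface state is needed. All function names are hypothetical.

```python
import math

def minmod(a, b):
    """Smaller-magnitude argument if the signs agree, otherwise zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def mp5_face(vmm, vm, v, vp, vpp, alpha=4.0, eps=1e-12):
    """Left state at interface j+1/2 from cell averages v_{j-2}, ..., v_{j+2}."""
    vor = (2.0*vmm - 13.0*vm + 47.0*v + 27.0*vp - 3.0*vpp) / 60.0
    vmp = v + minmod(vp - v, alpha * (v - vm))
    if (vor - v) * (vor - vmp) <= eps:
        return vor                           # unlimited fifth-order value is accepted
    dm = vmm - 2.0*vm + v                    # second differences at j-1, j, j+1
    d0 = vm - 2.0*v + vp
    dp = v - 2.0*vp + vpp
    dm4p = minmod(minmod(4.0*d0 - dp, 4.0*dp - d0), minmod(d0, dp))
    dm4m = minmod(minmod(4.0*d0 - dm, 4.0*dm - d0), minmod(d0, dm))
    vul = v + alpha * (v - vm)               # upper limit
    vmd = 0.5 * (v + vp) - 0.5 * dm4p        # median-based bound
    vlc = v + 0.5 * (v - vm) + (4.0 / 3.0) * dm4m   # large-curvature bound
    vmin = max(min(v, vp, vmd), min(v, vul, vlc))
    vmax = min(max(v, vp, vmd), max(v, vul, vlc))
    return vor + minmod(vmin - vor, vmax - vor)     # median(vor, vmin, vmax)

def rhs(u, a, dx):
    """Spatial operator -a du/dx with upwind fluxes (valid for a > 0), periodic."""
    n = len(u)
    flux = [a * mp5_face(u[j-2], u[j-1], u[j], u[(j+1) % n], u[(j+2) % n])
            for j in range(n)]
    return [-(flux[j] - flux[j-1]) / dx for j in range(n)]

def rk3_step(u, dt, a, dx):
    """One Shu-Osher SSP-RK3 step (assumed variant)."""
    u1 = [ui + dt * li for ui, li in zip(u, rhs(u, a, dx))]
    u2 = [0.75*ui + 0.25*(wi + dt*li) for ui, wi, li in zip(u, u1, rhs(u1, a, dx))]
    return [ui/3.0 + (2.0/3.0)*(wi + dt*li) for ui, wi, li in zip(u, u2, rhs(u2, a, dx))]

def advect(u0, a=1.0, length=1.0, t_end=1.0, cfl=0.2):
    """Advance the periodic advection problem to t_end with a fixed time step."""
    dx = length / len(u0)
    steps = math.ceil(t_end * a / (cfl * dx))
    dt = t_end / steps
    u = list(u0)
    for _ in range(steps):
        u = rk3_step(u, dt, a, dx)
    return u

# Smooth test: a sine wave advected over one full period should return to its start.
n = 50
u_init = [math.sin(2.0 * math.pi * (j + 0.5) / n) for j in range(n)]
u_final = advect(u_init)
err = max(abs(x - y) for x, y in zip(u_final, u_init))
```

The CFL number of 0.2 matches the bound 1/(1 + alpha) usually quoted for this limiter with alpha = 4. In a solver for the Euler equations the same reconstruction would be applied to each variable along each direction of the dimensional splitting, with the right state obtained by mirroring the stencil.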
GPU computing
The GPU processes data in a Single-Instruction-Multiple-Thread (SIMT) manner. The function executed on the GPU is called a kernel, and it is invoked from the host (CPU). The CUDA programming model is organized around grids and thread blocks: a grid consists of multiple thread blocks, and each thread block contains a number of threads. When a kernel is called, the scheduler unit on the device automatically assigns groups of thread blocks to the available streaming multi-processors (SM
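The grid, block, and thread hierarchy can be made concrete with a small serial emulation in Python. This is a sketch under stated assumptions, not CUDA code: the launcher `launch_1d` and the kernel `scale_kernel` are hypothetical names, the mapping of one thread to one cell is assumed for illustration, and the loops run one after another, whereas on the device the threads of a block execute concurrently.

```python
def launch_1d(kernel, n_items, block_dim, *args):
    """Serially emulate a 1D kernel launch: each (block, thread) pair runs once."""
    # Round up so that the grid covers every item even if n_items is not
    # a multiple of the block size.
    grid_dim = (n_items + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, n_items, *args)
    return grid_dim

def scale_kernel(block_idx, block_dim, thread_idx, n_items, data, factor):
    """Kernel body: each thread updates at most one cell."""
    i = block_idx * block_dim + thread_idx   # global index, as in CUDA's
                                             # blockIdx.x * blockDim.x + threadIdx.x
    if i < n_items:                          # guard threads past the end of the array
        data[i] *= factor

cells = [float(i) for i in range(1000)]
n_blocks = launch_1d(scale_kernel, len(cells), 256, cells, 2.0)
```

With 1000 cells and 256 threads per block, the launch needs four blocks, and the last 24 threads of the final block fail the bounds check and do no work.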
Implementation
The overall implementation of the code can be divided into two parts: the fluid dynamics and the kinetics. The fluid dynamics module is responsible for the advection calculation. The kinetics module, on the other hand, calculates the species consumption/production due to chemical reactions and ensures that detailed balance is satisfied. The overall flow chart of the program is shown in Fig. 4 (see also Fig. 5 for the flow chart of the fluid dynamics module). After all the flow variables have been
Solver results
The first objective is to verify that the solver is correctly implemented in the CUDA kernels. For this purpose, we compare the results with a pure-CPU version and also compute a set of standard test cases. The first of these is the Mach 3 wind tunnel problem (a.k.a. the forward step problem) using the MP5 scheme, whose solution is shown in Fig. 8. This problem was used by Woodward and Colella [14] to test a variety of numerical schemes. The whole domain is initialized with
Conclusion and future works
In this paper, we described the implementation of a numerical solver for simulating chemically reacting flow on the GPU. The fluid dynamics is modeled using high-order shock-capturing schemes, and the chemical kinetics is solved using an implicit solver. Results for both the fluid dynamics and the chemical kinetics are shown. Considering only the fluid dynamics, we obtained speed-ups of 30 and 55 times over the CPU version for the MP5 and ADERWENO schemes, respectively. For the
Acknowledgment
The authors would like to thank Prof. Ann Karagozian of UCLA for extensive support in performing simulations on the Hoffman2 GPU cluster.
References (22)
- et al., Large calculation of the flow over a hypersonic vehicle using a GPU, J. Comput. Phys. (2008)
- et al., Nodal discontinuous Galerkin methods on graphics processors, J. Comput. Phys. (2009)
- et al., Efficient magnetohydrodynamic simulations on graphics processing units with CUDA, Comput. Phys. Comm. (2011)
- et al., Accurate monotonicity-preserving schemes with Runge–Kutta time stepping, J. Comput. Phys. (1997)
- et al., ADER schemes for three-dimensional nonlinear hyperbolic systems, J. Comput. Phys. (2005)
- et al., On Godunov-type methods near low densities, J. Comput. Phys. (1991)
- et al., The numerical simulation of two-dimensional fluid flow with strong shocks, J. Comput. Phys. (1984)
- NVIDIA Corporation, Compute Unified Device Architecture Programming Guide version 4.0,...
- T. Brandvik, G. Pullan, Acceleration of a 3D Euler Solver using Commodity Graphics Hardware, in: 46th AIAA Aerospace...
- et al., Automatic generation of multi-core chemical kernels, IEEE Transactions on Parallel and Distributed Systems (2011)