Abstract
High-order schemes have attracted increasing attention in computational fluid dynamics (CFD). Weighted compact nonlinear schemes (WCNSs), one family of high-order schemes, have been widely applied in large eddy simulations, direct numerical simulations, and related applications. Because of their computational cost, however, WCNSs require high-performance platforms. In recent years, the highly parallel graphics processing unit (GPU) has rapidly matured as a powerful engine for high-performance computing. In this paper, we present a high-order, double-precision solver for three-dimensional compressible viscous flow on multi-block structured grids, running on GPU clusters. The solver uses a high-order WCNS for spatial discretization and the Jacobi iteration method for time discretization. To exploit the computational capability of both CPUs and GPUs, we present a workload balancing model that distributes work among them, and we design two strategies to overlap computation with communication. Performance analyses show that the single-GPU solver achieves a speed-up of about 8× relative to a serial computation on one CPU core. The performance results validate the workload distribution scheme, and strong and weak scaling analyses show that GPU clusters offer a significant performance advantage.
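The abstract does not spell out the workload balancing model, so the following is only a minimal sketch of one common approach, assuming the work is split in proportion to each device's measured throughput. The function names `split_cells` and `step_time` and the rate values are hypothetical and are not taken from the paper.

```python
def split_cells(total_cells, cpu_rate, gpu_rate):
    """Split grid cells between CPU and GPU in proportion to throughput.

    cpu_rate and gpu_rate are assumed to be measured throughputs
    (cells processed per second); they are illustrative inputs,
    not values reported in the paper.
    """
    if total_cells < 0 or cpu_rate < 0 or gpu_rate < 0:
        raise ValueError("inputs must be non-negative")
    if cpu_rate + gpu_rate == 0:
        raise ValueError("at least one device must have a positive rate")
    # The GPU receives the share matching its fraction of total throughput.
    gpu_cells = round(total_cells * gpu_rate / (cpu_rate + gpu_rate))
    cpu_cells = total_cells - gpu_cells
    return cpu_cells, gpu_cells


def step_time(cpu_cells, gpu_cells, cpu_rate, gpu_rate):
    """Time of one step when both devices work concurrently.

    The slower device determines the step time, so a balanced split
    makes the two times as close to equal as possible.
    """
    cpu_time = cpu_cells / cpu_rate if cpu_cells else 0.0
    gpu_time = gpu_cells / gpu_rate if gpu_cells else 0.0
    return max(cpu_time, gpu_time)


if __name__ == "__main__":
    # Illustrative only: a GPU assumed 8x faster than the CPU share it is paired with.
    cpu_cells, gpu_cells = split_cells(900, cpu_rate=1, gpu_rate=8)
    print(cpu_cells, gpu_cells)
    print(step_time(cpu_cells, gpu_cells, 1, 8))
```

Under this assumption both devices finish a step at roughly the same time, which is the condition a balancing model aims for. A real model would also need to account for data transfer and communication costs, which this sketch ignores.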
Cite this article
Cao, W., Xu, Cf., Wang, Zh. et al. CPU/GPU computing for a multi-block structured grid based high-order flow solver on a large heterogeneous system. Cluster Comput 17, 255–270 (2014). https://doi.org/10.1007/s10586-013-0332-1
DOI: https://doi.org/10.1007/s10586-013-0332-1