
CPU/GPU computing for a multi-block structured grid based high-order flow solver on a large heterogeneous system

Cluster Computing

Abstract

High-order schemes have attracted increasing attention in computational fluid dynamics (CFD) simulations. Weighted compact nonlinear schemes (WCNSs), one family of high-order schemes, have been widely applied in large eddy simulations, direct numerical simulations, and related applications. However, because of their computational complexity, WCNSs require high-performance platforms. In recent years, the highly parallel graphics processing unit (GPU) has rapidly matured as a powerful engine for high-performance computing. In this paper, we present a high-order, double-precision solver for three-dimensional compressible viscous flow on multi-block structured grids, running on GPU clusters. The solver uses the high-order WCNS scheme for space discretization and the Jacobi iteration method for time discretization. To exploit the computational capability of both CPUs and GPUs, we present a workload balancing model that distributes the workload among them, and we design two strategies to overlap computation with communication. The performance analyses show that the single-GPU solver achieves a speed-up of about 8× relative to a serial computation on one CPU core. The performance results also validate the workload distribution scheme. The strong and weak scaling analyses show that GPU clusters offer a significant performance advantage.
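The abstract does not spell out the workload balancing model, so the following is only a minimal sketch of one common approach, assuming that each device's share of grid cells is made proportional to its measured throughput. The function names `split_workload` and `estimated_step_time`, the throughput figures, and the cell counts are hypothetical and are not taken from the paper; the 8:1 ratio in the example merely echoes the reported single-GPU speed-up.

```python
from fractions import Fraction


def split_workload(total_cells, throughputs):
    """Split total_cells among devices in proportion to their throughput.

    throughputs holds one positive number per device (for example, cells
    processed per second). These are assumed measurements, not values
    from the paper. Returns integer cell counts that sum to total_cells.
    """
    if total_cells < 0:
        raise ValueError("total_cells must be non-negative")
    if not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("throughputs must be a non-empty list of positive numbers")

    # Exact rational arithmetic avoids rounding surprises in the shares.
    rates = [Fraction(t) for t in throughputs]
    total_rate = sum(rates)
    ideal = [total_cells * r / total_rate for r in rates]

    # Round every share down, then hand the leftover cells to the devices
    # with the largest fractional remainders.
    shares = [int(x) for x in ideal]
    leftover = total_cells - sum(shares)
    order = sorted(range(len(ideal)), key=lambda i: ideal[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def estimated_step_time(shares, throughputs):
    """Time of one step if all devices run concurrently: the slowest one decides."""
    return max(s / t for s, t in zip(shares, throughputs))


if __name__ == "__main__":
    # Hypothetical node: one GPU that is 8x faster than one CPU core.
    rates = [8.0, 1.0]
    shares = split_workload(900, rates)
    print(shares)  # [800, 100]
    print(estimated_step_time(shares, rates))
```

With a proportional split, every device is estimated to finish at the same time, which is the condition a static balancing model typically aims for. A real solver would also have to respect block boundaries and communication cost, which this sketch ignores.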






Corresponding author

Correspondence to Wei Cao.


About this article

Cite this article

Cao, W., Xu, Cf., Wang, Zh. et al. CPU/GPU computing for a multi-block structured grid based high-order flow solver on a large heterogeneous system. Cluster Comput 17, 255–270 (2014). https://doi.org/10.1007/s10586-013-0332-1


  • DOI: https://doi.org/10.1007/s10586-013-0332-1
