CPU/GPU computing for a multi-block structured grid based high-order flow solver on a large heterogeneous system

Cao, Wei; Xu, Chuan-fu; Wang, Zheng-hua; Yao, Lu; Liu, Hua-yong

doi:10.1007/s10586-013-0332-1

Published: 27 November 2013

Volume 17, pages 255–270, (2014)

Cluster Computing

Abstract

High-order schemes have attracted growing attention in computational fluid dynamics (CFD) simulations. One family of high-order schemes, weighted compact nonlinear schemes (WCNSs), has been widely applied in large eddy simulations, direct numerical simulations, and similar applications. However, because of their computational complexity, WCNSs require high-performance platforms. In recent years, the highly parallel graphics processing unit (GPU) has rapidly matured as a powerful engine for high-performance computing. In this paper, we present a high-order, double-precision solver for three-dimensional compressible viscous flow using multi-block structured grids on GPU clusters. The solver uses a high-order WCNS for space discretization and the Jacobi iteration method for time discretization. To exploit the computational capability of both CPUs and GPUs, we present a workload balancing model for distributing work between them. We also design two strategies to overlap computation with communication. The performance analyses show that the single-GPU solver achieves a speed-up of about 8× relative to a serial computation on one CPU core. The performance results validate the workload distribution scheme. The strong and weak scaling analyses show that GPU clusters offer a significant performance advantage.
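The abstract does not spell out the workload balancing model, so the following is only a minimal sketch of one common approach, not the authors' method: split the grid cells in proportion to measured throughput so that the CPU and GPU partitions finish a step at about the same time. The function names `split_workload` and `step_time`, and the throughput figures used in the usage note, are hypothetical.

```python
def split_workload(total_cells, cpu_rate, gpu_rate):
    """Split cells between CPU and GPU in proportion to throughput.

    cpu_rate and gpu_rate are assumed to be measured throughputs
    (cells per second) for the same kernel; they are illustrative
    inputs, not values taken from the paper.
    Returns (cpu_cells, gpu_cells).
    """
    if total_cells < 0:
        raise ValueError("total_cells must be non-negative")
    if cpu_rate < 0 or gpu_rate < 0 or cpu_rate + gpu_rate <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    # Equal finish time: cpu_cells / cpu_rate == gpu_cells / gpu_rate.
    gpu_cells = round(total_cells * gpu_rate / (cpu_rate + gpu_rate))
    cpu_cells = total_cells - gpu_cells
    return cpu_cells, gpu_cells


def step_time(cpu_cells, gpu_cells, cpu_rate, gpu_rate):
    """Time for one step when both devices run concurrently."""
    cpu_t = cpu_cells / cpu_rate if cpu_cells else 0.0
    gpu_t = gpu_cells / gpu_rate if gpu_cells else 0.0
    # The slower device determines the step time.
    return max(cpu_t, gpu_t)
```

As a usage example, with a hypothetical GPU that is 8 times faster than the CPU resources sharing the work, `split_workload(900, 1.0, 8.0)` assigns 100 cells to the CPU and 800 to the GPU. Under this simple model both partitions then take the same time, whereas running all 900 cells on the GPU alone would take longer. A real solver would also have to account for data transfer and block boundaries, which this sketch ignores.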

Author information

Authors and Affiliations

School of Computer, National University of Defense Technology, Changsha, 410073, China
Wei Cao, Chuan-fu Xu, Zheng-hua Wang & Lu Yao
State Key Laboratory of Aerodynamics, Mianyang, 621000, China
Hua-yong Liu

Corresponding author

Correspondence to Wei Cao.

About this article

Cite this article

Cao, W., Xu, Cf., Wang, Zh. et al. CPU/GPU computing for a multi-block structured grid based high-order flow solver on a large heterogeneous system. Cluster Comput 17, 255–270 (2014). https://doi.org/10.1007/s10586-013-0332-1

Received: 14 February 2013
Revised: 31 August 2013
Accepted: 03 November 2013
Published: 27 November 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s10586-013-0332-1
