Towards a heterogeneous architecture solver for the incompressible Navier–Stokes equations

  • Regular Paper
CCF Transactions on High Performance Computing

Abstract

Large-scale supercomputers equipped with GPUs as accelerators have the potential to meet the demands of future exascale computing. In this work, we use GPUs to accelerate the solution of large, sparse linear systems by Krylov subspace methods, a step that is crucial to the overall performance of many industrial and scientific applications. To achieve this on target hardware with a large number of heterogeneous computing nodes, this work makes two main contributions. First, we propose a communication-avoiding variant of the BiCGStab method that reduces the number of global synchronization points per iteration from three in the classical method to one. This reduction in expensive global communication across all computing processes is expected to pay off on large-scale distributed-memory clusters. Second, to handle host-to-accelerator data transfers, the main challenge in using heterogeneous architectures, we propose a communication-overlapped implementation of sparse matrix–vector multiplication, since this kernel features heavily in Krylov subspace methods. Linear systems arising from the incompressible Navier–Stokes equations are used to evaluate the proposed solution and optimization methods. The GPU and CPU implementations are evaluated on up to 256 GPUs and 4096 CPU cores, respectively. The results show that, for the same computation time, the GPU implementation on heterogeneous nodes, each equipped with 4 GPUs and a 32-core CPU, requires half as many computing nodes as the CPU implementation. This result illustrates the advantage of heterogeneous architectures from the application point of view and motivates their wider use in related areas.
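
To make the count of three global synchronization points concrete, the following is a minimal sketch of the classical, unpreconditioned BiCGStab iteration in pure Python. It is not the communication-avoiding variant proposed in the paper, and it is not the authors' code. The function names (`bicgstab`, `dot`, `matvec`, `axpy`), the dense list-of-lists matrix, and the `syncs` counter are all assumptions made for illustration. Each place where a distributed-memory run would need a global reduction before the iteration can continue is counted once, with inner products that depend on the same vectors assumed to share a single reduction.

```python
# Classical, unpreconditioned BiCGStab, instrumented to count the points
# where a distributed-memory run would need a global reduction.
# All names here are illustrative assumptions, not taken from the paper.

def dot(x, y):
    """Inner product; in a distributed run this needs a global reduction."""
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    """Dense matrix-vector product standing in for the sparse kernel."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def axpy(a, x, y):
    """Return a*x + y as a new list."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def bicgstab(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = list(b)                # residual for the zero initial guess
    r_hat = list(r)            # fixed shadow residual
    p = [0.0] * n
    v = [0.0] * n
    rho = alpha = omega = 1.0
    b_norm = dot(b, b) ** 0.5  # one-off setup cost, not counted below
    iters = syncs = 0
    while iters < max_iter:
        # Sync point 1: (r_hat, r) and (r, r) are assumed to share one reduction.
        rho_new, rr = dot(r_hat, r), dot(r, r)
        syncs += 1
        if rr ** 0.5 <= tol * b_norm:
            break
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, axpy(-omega, v, p), r)  # p = r + beta*(p - omega*v)
        v = matvec(A, p)
        # Sync point 2: (r_hat, v) must be known before alpha can be formed.
        alpha = rho_new / dot(r_hat, v)
        syncs += 1
        s = axpy(-alpha, v, r)                 # s = r - alpha*v
        t = matvec(A, s)
        # Sync point 3: (t, s) and (t, t) are assumed to share one reduction.
        ts, tt = dot(t, s), dot(t, t)
        syncs += 1
        omega = ts / tt if tt > 0.0 else 0.0   # guard: t == 0 means s == 0
        x = axpy(alpha, p, axpy(omega, s, x))  # x = x + alpha*p + omega*s
        r = axpy(-omega, t, s)                 # r = s - omega*t
        rho = rho_new
        iters += 1
    return x, iters, syncs
```

Calling `bicgstab(A, b)` on a small nonsingular system returns the approximate solution together with the iteration count and the number of counted reductions. When the loop exits through the convergence check, the counter equals `3 * iters + 1`: three reductions per completed iteration plus the one that carries the final residual norm. Each of the three reductions sits between dependent steps, which is why they cannot simply be merged without restructuring the recurrences.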

Acknowledgements

We thank Prof. Shuangling Hu of the China Academy of Engineering Physics for his great help in implementing the CFD codes and for providing the finite volume discretization kernel.

Author information

Corresponding author

Correspondence to Xin He.

About this article

Cite this article

Wang, Y., He, X., Yang, S. et al. Towards a heterogeneous architecture solver for the incompressible Navier–Stokes equations. CCF Trans. HPC 2, 123–134 (2020). https://doi.org/10.1007/s42514-020-00034-9

  • DOI: https://doi.org/10.1007/s42514-020-00034-9
