Application of the parallel diagonal dominant algorithm for the incompressible Navier-Stokes equations

https://doi.org/10.1016/j.jcp.2020.109795Get rights and content

Highlights

  • Proved the applicability of the PDD algorithm for the incompressible Navier-Stokes equations.

  • Proved errors from the PDD algorithm are bounded regardless of diagonal dominance of tridiagonal matrices.

  • Investigated effects of the CFL number and the number of grid points on the accuracy.

  • Developed a highly scalable ADI-PDD method by utilizing an aggregative data communication method.

Abstract

The accuracy and the applicability of the parallel diagonal dominant (PDD) algorithm are explored for highly scalable computation of the incompressible Navier-Stokes equations which are integrated using a fully-implicit fractional-step method in parallel computational environments. The PDD algorithm is known to be applicable only for an evenly diagonal dominant matrix. In the present study, however, it is shown mathematically that the PDD algorithm is utilizable even for non-diagonal dominant matrices derived from discretization of incompressible momentum equations. The order of accuracy and the error characteristics are investigated in detail in terms of the Courant-Friedrichs-Lewy (CFL) number and the grid spacing by conducting simulations of decaying vortices in both two and three dimensions, flow in a lid-driven cavity, and flow over a circular cylinder. In order to reduce communication cost, which is one of bottlenecks in parallel computation, an aggregative data communication method is combined with the PDD algorithm. Parallel performance of the present PDD-based method is investigated by measuring the speedup, efficiency, overhead, and serial fraction.

Introduction

Momentum equations of incompressible flow integrated by using implicit schemes are formulated into a linear system with banded matrices for a structured-grid algorithm. For example, in a three-dimensional configuration, a hepta-diagonal matrix system is generated when 3-band centered difference schemes are employed for spatial derivatives. The hepta-diagonal matrix is often factorized into three tridiagonal matrices using an alternating-direction-implicit (ADI) method [1]. The ADI method significantly reduces the computational cost since tridiagonal matrices can be efficiently inverted using the Thomas algorithm. However, the Thomas algorithm is inherently sequential and requires a global transpose technique with a pipelining method for computation of a domain-decomposed problem on a distributed-memory parallel machine. The communication cost for the global transpose is known to be comparable to the cost for matrix inversion since entire data on all the decomposed domains must be communicated to invert tridiagonal matrices.

Due to the difficulty in parallel inversion of tridiagonal matrices, alternatively, a parallel multigrid method can be used to solve a sparse matrix which is obtained by discretizing the momentum equations without ADI factorization [2], [3], [4]. However, multigrid methods generally require an explicit integration or a linearization of the nonlinear convection terms to avoid complicated coarsening procedures which are necessary when matrix components are changed during the iteration.

Algorithms for solving a tridiagonal matrix on a multi-processor machine have been developed for decades based on the Thomas algorithm. Johnsson [5] introduced cyclic reduction (CR) and partition methods for parallel computation of tridiagonal matrix systems. CR type algorithms are designed mainly for shared-memory machines, while partition methods are often used for distributed-memory machines [6]. The partition method was improved by changing the pattern of data communication [7], [8]. However, the partition method requires to solve sub-matrices which are in tridiagonal forms. The sub-matrices are not inverted in parallel since collection and distribution of data are required. Hofhaus and Van de Velde [9] tested the performance of various methods for parallel inversion of tridiagonal matrices in an ADI framework. Methods used in the work are pipelining, recursive doubling, CR, partitioning, divide and conquer, and a modified half-Gauss-Seidel line-relaxation method, while scalability of those methods is reported to be rather limited. Singh and You [10] suggested that a tridiagonal matrix can be recast to be solved independently among partitioned blocks, however, along with an iterative technique, which again increases the computational cost.

Sun et al. [11], [12] proposed a parallel diagonal dominant (PDD) algorithm to achieve highly scalable performance for inversion of a tridiagonal matrix. The algorithm requires to solve sub-matrices similar to the partition method, but with different forms of sub-matrices which are easily invertible in parallel. Fu and Li [13] combined the PDD algorithm with the CR algorithm to solve the incompressible hydrostatic equations. The PDD algorithm was employed to compute matrices produced by the compact scheme treatments of spatial derivative terms in the incompressible Navier-Stokes equations [14] and to solve the Fourier transformed pressure Poisson equation for the Rayleigh-Bénard convection problem [15]. For all the previous applications, the PDD algorithm was utilized for only diagonal dominant matrices since the accuracy of the solution has been analyzed only for evenly diagonal dominant matrices.

So far, to the best of our knowledge, the PDD algorithm has not been utilized for solution of linear systems derived from discretization of the full incompressible Navier-Stokes equations. In the present work, the PDD algorithm is applied to the solution of the incompressible Navier-Stokes equations which are discretized using second-order central-difference schemes and integrated in time using a second-order fully implicit ADI scheme. Detailed mathematical analysis and numerical experiments show that errors of the present numerical method are characterized according to the CFL number and the number of grid points in each decomposed block for parallel computation. In addition, to reduce communication cost among processors and thereby improving parallel scalability of the present numerical method, an aggregative data communication technique [16] is also utilized.

The present paper is organized as follows: in Section 2, numerical methods for solving the incompressible Navier-Stokes equations are described. In Section 3, applications of the PDD algorithm to solve a series of tridiagonal matrices in the ADI form of the discretized governing equations are presented. Error analysis for the present method and discussion of results from numerical experiments are presented in Sections 4 and 5, respectively, followed by concluding remarks in Section 6.

Section snippets

Numerical methods

The incompressible Navier-Stokes equations are given as follows:uit+xjuiuj=pxi+ν2uixjxj,uixi=0, where xi, ui, p, and ν are the Cartesian coordinates, velocity components, pressure, and kinematic viscosity, respectively. Note that the constant density is absorbed in the pressure. In the present study, a collocated-grid scheme on a Cartesian grid is employed. The velocity and the pressure are stored at the center of computational cells.

Numerical methods for solving above equations are

PDD algorithm for parallel computation of an ADI system

(b10c10a20b20c20am0bm0cm0a11b11c11a21b21c21am1bm1)(X10,r+1X20,r+1Xm0,r+1X11,r+1X21,r+1Xm1,r+1)=(d10d20dm0d11d21dm1)

A tridiagonal matrix of the order N can be efficiently inverted using the Thomas algorithm which requires only O(N) operations when a single processor is employed. However, the Thomas algorithm is an inherently sequential algorithm composed of forward elimination and backward substitution. When a computational domain is decomposed into multiple smaller domains

Accuracy of the PDD algorithm

The accuracy of the PDD algorithm for computation of the present ADI system is investigated. Sun et al. [11] analyzed the error of the PDD algorithm when the matrix A is evenly diagonal dominant and signs of ai and ci are equal. They proved that the maximum values of vm and w1 which are introduced in Section 3.1 are bounded as follows:max(|vm|,|w1|)1(m+1)(1+θ/4)m where θ=min(|bi|/max(|ai|,|ci|)2) for i=[1,2,...,m]. θ is the definition of diagonal dominance.

Matrices derived from momentum

Two-dimensional decaying vortices

Numerical simulations of two-dimensional decaying vortices are conducted to verify spatial and temporal orders of accuracy of the present ADI method coupled with the PDD algorithm. Note that the Thomas algorithm is employed to solve the momentum equations on a single-block domain, while for a decomposed multi-block domain, the PDD algorithm is employed for inversion of tridiagonal matrices. The analytical velocity and pressure fields for decaying vortices are expressed as follows:u=cos(πLx)sin

Conclusions

The PDD algorithm has been combined with an ADI method to achieve highly scalable parallel computation of the incompressible Navier-Stokes equations on distributed-memory parallel machines. The PDD algorithm is known for an efficient parallel algorithm for inverting a tridiagonal matrix without global data communication. However, the PDD algorithm has been only employed for a tridiagonal matrix which is evenly diagonal dominant since the error analysis has been conducted only for an evenly

CRediT authorship contribution statement

Hojun Moon: Writing - Original draft preparation, Methodology, Software. Seungpyo Hong: Validation, Investigation. Donghyun You: Conceptualization, Writing - Original draft preparation, Funding Acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research has been supported by the National Research Foundation of Korea (NRF) under the Project Number NRF-2019K1A3A1A74107685, the Korea Institute of Energy Technology Evaluation and Planning (KETEP) of the Ministry of Trade, Industry and Energy (MOTIE) of the Republic of Korea (No. 20193020020160), and POSCO under the Project Number 2018Y077.

References (30)

  • Jinmo Lee et al.

    An implicit ghost-cell immersed boundary method for simulations of moving body problems with control of spurious force oscillations

    J. Comput. Phys.

    (2013)
  • Wim M. Van Rees et al.

    A comparison of vortex and pseudo-spectral methods for the simulation of periodic vortical flows at high Reynolds numbers

    J. Comput. Phys.

    (2011)
  • Dimitri J. Mavriplis et al.

    Multigrid solution of the Navier-Stokes equations on triangular meshes

    AIAA J.

    (1990)
  • S. Lennart Johnsson

    Solving tridiagonal systems on ensemble architectures

    SIAM J. Sci. Stat. Comput.

    (1987)
  • H.H. Wang

    A parallel method for tridiagonal equations

    ACM Trans. Math. Softw.

    (1981)
  • Cited by (0)

    View full text