
Parallel Computing

Volume 25, Issue 5, May 1999, Pages 593-611

A wavenumber parallel computational code for the numerical integration of the Navier–Stokes equations

https://doi.org/10.1016/S0167-8191(99)00003-4

Abstract

A parallel computational code for the numerical integration of the Navier–Stokes equations has been developed. The system of partial differential equations describing the non-steady flow of a viscous incompressible fluid in three dimensions is considered and applied to the channel flow problem. A mixed spectral-finite difference technique for the numerical integration of the governing equations is devised: Fourier decomposition in both the streamwise and spanwise directions and finite differences in the direction orthogonal to the solid walls are used, while a semi-implicit procedure of Runge–Kutta and Crank–Nicolson type is utilised for the advancement in time. A wavenumber parallelism is implemented for the execution of the calculations. Within each time step of integration, the computations are executed in two distinct phases, each phase corresponding to a different way of decomposing the computational domain, vertically and horizontally, respectively; in both phases of the whole calculation process, each portion of the computing domain is handled by a different CPU on a Convex SPP 1200/XA parallel computing system. Results are presented in terms of the performance of the calculation procedure with the use of 2, 4, 6 and 8 processors, and are compared with the single-processor performance. The accuracy of the parallel algorithm has also been tested, by analysing the evolution in time of small-amplitude disturbances of the mean flow; satisfactory agreement with the theoretical solution given by hydrodynamic stability theory is found, provided that a sufficient number of grid points is used in the y direction.

Introduction

The calculation of turbulent flows has been an important challenge for scientists for many years; several approaches to the problem have been devised, such as the Reynolds Averaged Navier–Stokes equations (RANS), Large Eddy Simulation (LES) and the Direct Numerical Simulation (DNS) of turbulence [1], [2]. DNS is the most demanding approach in terms of computational resources; it consists of the numerical integration of the Navier–Stokes equations without any modeling, with sufficient accuracy to resolve the smallest turbulent scales in space and time. Estimates of the computer time and memory needed to perform accurate calculations show that remarkable computational resources are required for the execution of numerical simulations even at relatively low values of the Reynolds number [2].

Contributions in the field of the numerical integration of the Navier–Stokes equations applied to the channel flow problem have been given – among others – by Kim, Moin and Moser [3], Orszag and Kells [4] and Malik, Zang and Hussaini [5]. Ref. [3] describes a direct numerical simulation of the turbulent channel flow at Re=3300 (Reynolds number); a spectral code has been used, based on Fourier decomposition in the streamwise and spanwise directions and Chebyshev polynomial expansion in the normal direction. The time advancement has been carried out by means of a semi-implicit scheme as in Refs. [6], [7] and the results compared with experimental measurements. Fourier–Chebyshev spectral methods have also been used in Ref. [4] for an analysis of the stability of finite amplitude perturbations of the mean flow, and in Ref. [5], in which a preconditioned iterative technique has been applied to solve an implicit formulation of the governing equations.

In the discussion about the advantages and disadvantages of spectral methods and finite differences – the most used numerical techniques in geometrically simple computational domains – Malik, Zang and Hussaini [5] found that results obtained with a 33-point Chebyshev polynomial expansion in the normal direction were slightly more accurate than those of a 257-point finite difference discretization. Other authors have investigated the accuracy of high-order, fully finite difference schemes. Rai and Moin [8] developed a spatially high-order-accurate, upwind-biased finite difference scheme on a staggered grid; they tested their results by monitoring the evolution of small-amplitude disturbances in the channel flow and by computing the fully developed turbulent channel flow for comparison with [3]. They found that high-order-accurate upwind schemes can yield good estimates of the evolution of flow instabilities, but a minimum number of grid points is required to obtain accurate solutions. A study directed to the comparison of high-order formulations with second-order central-difference schemes for the integration of the Navier–Stokes equations has been performed by Tafti [9]. Conservative and non-conservative forms of the convective terms have been discretized using fifth-order-accurate upwind-biased approximations and the results have been verified by means of various test cases, including the evolution of small-amplitude perturbations as in [5], [8]. One of his final conclusions was that high-order finite difference schemes do not add sufficient accuracy to the results to justify the extra computational effort associated with their use.

With the advent of parallel computers, parallel Navier–Stokes solvers – suitable for the Direct Numerical Simulation of turbulence – started to appear; parallel computational codes for the numerical integration of the unsteady incompressible Navier–Stokes equations have been developed by several authors on both SIMD and MIMD machines. Levit and Jespersen [10] investigated the performance of a two-dimensional Navier–Stokes solver implemented on a CM-2 machine: they compared the performance of the code with that obtained on a CRAY-2, reporting satisfactory results with an explicit third-order Runge–Kutta algorithm for the time advancement. Pelz [11] studied the performance of algorithms of the Fourier pseudospectral method for the numerical solution of the Navier–Stokes equations on a 1024-node hypercube computer, obtaining an efficiency of 83% for a three-dimensional problem with a $128^3$ mesh. Chen and Shan [12] presented spectral calculations on the Connection Machine-2 with a parallel algorithm for the three-dimensional Navier–Stokes equations, suitable for direct numerical simulation of homogeneous turbulent flows; they implemented a $512^3$ mesh resolution with periodic boundaries and reported a computational speed 30% faster than that of corresponding simulations on a four-processor CRAY-2. A spectral technique for the Navier–Stokes equations has also been used by Basu [13], on a three-processor multicomputer. Briscolini [14] implemented a parallel Navier–Stokes solver for homogeneous turbulence on an IBM scalable computing system.

A relatively recent numerical technique, the spectral element technique, is well suited both to implementation on parallel computers and to simulations of fluid flows in geometrically complex domains. Fischer, Ho, Karniadakis, Ronquist and Patera [15] presented a high-efficiency, medium-grained parallel spectral element method for the numerical solution of the unsteady incompressible Navier–Stokes equations implemented on an Intel Hypercube, and evaluated the optimality of the algorithm-architecture coupling; Floros and Reeve [16] presented an evaluation of a spectral-element Navier–Stokes solver on three different parallel architectures, a network of transputers, an Intel iPSC/860 and a Meiko CS-2; Crawford, Evangelinos, Newman and Karniadakis [17] presented benchmark results of turbulence calculations from the parallel implementation of a three-dimensional Navier–Stokes solver on different platforms, the IBM SP2, SGI Power Challenge XL and CRAY C90. The solver is based on a mixed spectral element-Fourier expansion technique, in which the Fourier expansion is used in the homogeneous direction and the spectral element discretization in the plane orthogonal to this direction. Two test cases are considered, a wall-bounded flow with a rough surface and an external flow past bluff bodies. An analysis of perspectives in the field of the parallel numerical simulation of incompressible flows, particularly in complex geometries, can be found in Fischer and Patera [18]. A mixed spectral element, pseudospectral and finite difference scheme for the Navier–Stokes equations has been implemented on a Meiko parallel computer by Prestin and Shtilman [19] for the analysis of jetlike flows; the performance achieved by the spectral element code on the 28-node Meiko computer approached 200 megaflops in single precision and 150 megaflops in double precision.

Parallel Navier–Stokes solvers for the pipe flow case have been developed by Tal [20] and by Briscolini and Fatica [21]. Ref. [20] describes the development of the solver on a cluster of ALPHA workstations; a parallel efficiency of around 80% is achieved with different resolutions and the use of up to 10 processors. In Ref. [21] a second-order finite difference Navier–Stokes solver implemented on an IBM SP2 is presented; parallel efficiencies of 80–90% are reported by the authors. A parallel Navier–Stokes solver for the channel flow case has been developed by Garg, Ferziger and Monismith [22]. They considered the case of stratified turbulent flow in a channel, described by the Navier–Stokes equations and the scalar transport equation. They used a mixed technique, Fourier decomposition and finite differences in space, and a semi-implicit Crank–Nicolson and third-order Runge–Kutta scheme in time. They tested the code on both Intel Paragon and iPSC/860 Hypercube computers; efficiencies ranging from 91% to 60% are reported by the authors.

In the present work, a parallel Navier–Stokes solver applied to the channel flow problem is described; the aim of the authors is to show that the parallel implementation of the mixed spectral-finite difference scheme that has been developed is a good choice to overcome the major difficulties in performing direct numerical simulations of wall-bounded turbulent flows, from the viewpoints of both computational resources and accuracy of the calculations. In the following sections the numerical techniques, the characteristics of the parallel computer and the parallelization concepts are described; parallel performance results are presented for different calculation domains, of N × N × N and N × 2N × N type respectively, directed to the comparison of the parallel performance obtained with 2, 4, 6 and 8 processors against the single-processor performance. The accuracy of the calculations has been analysed in the framework of the linear stability theory, by monitoring the temporal evolution of small-amplitude perturbations of the mean flow. The calculations have been executed on the Convex SPP 1200/XA of CILEA (Consorzio Interuniversitario Lombardo per l'Elaborazione Automatica) in Segrate (Milano).

Section snippets

Mathematical formulation and computational techniques

The system of nonlinear partial differential equations – in non-dimensional, divergence form and index notation (i, k = 1, 2, 3) – governing the flow of a viscous incompressible fluid (the Navier–Stokes equations) is considered:

$$\partial_t V_k + \partial_i (V_k V_i) = -\partial_k p + \frac{1}{Re}\,\partial_i \partial_i V_k, \qquad \partial_i V_i = 0,$$

where Re is the Reynolds number. Spatial coordinates and velocity components will be named x, y, z and u, v, w respectively; variables and operators have been nondimensionalized by using the channel half-width H and the steady-state
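As an illustration of the mixed spectral-finite difference discretization described above (Fourier decomposition in the periodic directions, finite differences in the wall-normal direction), the following sketch shows a Fourier derivative along the streamwise direction and a second-order finite difference along y. It is not the authors' code: the grid sizes, box lengths, test field and the use of NumPy's FFT are illustrative assumptions.

    # Minimal sketch (assumed parameters, not the production code) of the mixed
    # spectral-finite difference spatial discretization: Fourier modes in x (and,
    # analogously, z), second-order central differences in the wall-normal direction y.
    import numpy as np

    Nx, Ny, Nz = 16, 33, 16          # assumed resolution along x, y, z
    Lx, Lz = 2 * np.pi, np.pi        # assumed periodic box lengths
    y = np.linspace(-1.0, 1.0, Ny)   # wall-normal points, channel half-width H = 1
    dy = y[1] - y[0]

    # Angular wavenumbers associated with the Fourier decomposition in x
    kx = 2 * np.pi * np.fft.fftfreq(Nx, d=Lx / Nx)

    def spectral_x_derivative(u):
        """d/dx via Fourier differentiation along the streamwise direction."""
        u_hat = np.fft.fft(u, axis=0)
        return np.real(np.fft.ifft(1j * kx[:, None, None] * u_hat, axis=0))

    def fd_y_derivative(u):
        """d/dy via second-order central differences (one-sided at the walls)."""
        dudy = np.empty_like(u)
        dudy[:, 1:-1, :] = (u[:, 2:, :] - u[:, :-2, :]) / (2 * dy)
        dudy[:, 0, :] = (u[:, 1, :] - u[:, 0, :]) / dy
        dudy[:, -1, :] = (u[:, -1, :] - u[:, -2, :]) / dy
        return dudy

    # Example: derivatives of a smooth test field u(x, y, z) satisfying the no-slip walls
    X, Y, Z = np.meshgrid(np.linspace(0, Lx, Nx, endpoint=False), y,
                          np.linspace(0, Lz, Nz, endpoint=False), indexing='ij')
    u = np.sin(X) * (1 - Y**2) * np.cos(2 * Z)
    dudx = spectral_x_derivative(u)
    dudy = fd_y_derivative(u)

The same pattern, with the finite difference operator replaced by an implicit treatment of the wall-normal diffusion, underlies the semi-implicit time advancement described in the text.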

Computing system

The Convex Exemplar SPP 1200/XA is a massively parallel, scalable MIMD machine; it contains HP PA-RISC 7200 processors (120 MHz) arranged in hypernodes. The hypernode is a symmetric multiprocessor system formed by four blocks of two CPUs each; within each hypernode, processors communicate via a nonblocking crossbar with a peak bandwidth of 1.25 Gbytes/s, while interhypernode communications take place via high-speed scalable coherent interface rings of 600 Mbytes/s, the CTI (Coherent Toroidal Interface) rings.

Parallelization

A critical part of the whole calculation procedure is the evaluation of the Cu, Cv, Cw terms (5), which include the convective (nonlinear) terms in x, y, z and the diffusive terms along x and z; numerically (subroutine Cuvw_tot.f), this process is handled by means of a fourth-order Runge–Kutta algorithm:

$$C_u = \tfrac{1}{6}\,\Delta t\,\bigl(C_u^0 + 2C_u^1 + 2C_u^2 + C_u^3\bigr), \quad C_v = \tfrac{1}{6}\,\Delta t\,\bigl(C_v^0 + 2C_v^1 + 2C_v^2 + C_v^3\bigr), \quad C_w = \tfrac{1}{6}\,\Delta t\,\bigl(C_w^0 + 2C_w^1 + 2C_w^2 + C_w^3\bigr).$$

The evaluation of the nonlinear terms is performed pseudospectrally, by
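As a minimal sketch of how the above combination can be formed, the code below applies the classical fourth-order Runge–Kutta staging to the explicitly treated terms. This is not the Cuvw_tot.f subroutine: the function explicit_rhs standing in for the evaluation of the convective and x-z diffusive terms, and the assumption of standard RK4 stages, are illustrative placeholders.

    # Minimal sketch of the RK4 combination Cu = (1/6) dt (Cu0 + 2 Cu1 + 2 Cu2 + Cu3),
    # and likewise for Cv, Cw; explicit_rhs is a hypothetical placeholder for the
    # pseudospectral evaluation of the convective and x-z diffusive terms.
    import numpy as np

    def rk4_explicit_increment(u, v, w, dt, explicit_rhs):
        """Return (Cu, Cv, Cw) = (dt/6) * (C0 + 2*C1 + 2*C2 + C3) for each component."""
        C0 = explicit_rhs(u, v, w)                                            # stage 0: current fields
        C1 = explicit_rhs(*(f + 0.5 * dt * c for f, c in zip((u, v, w), C0))) # stage 1: half step
        C2 = explicit_rhs(*(f + 0.5 * dt * c for f, c in zip((u, v, w), C1))) # stage 2: half step
        C3 = explicit_rhs(*(f + dt * c for f, c in zip((u, v, w), C2)))       # stage 3: full step
        return tuple(dt / 6.0 * (c0 + 2 * c1 + 2 * c2 + c3)
                     for c0, c1, c2, c3 in zip(C0, C1, C2, C3))

    # Example usage with a trivial linear right-hand side (decay of each component)
    u0, v0, w0 = (np.ones((4, 4, 4)) for _ in range(3))
    rhs = lambda u, v, w: (-u, -v, -w)
    Cu, Cv, Cw = rk4_explicit_increment(u0, v0, w0, dt=0.01, explicit_rhs=rhs)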

Parallel performance with computational domains of N×N×N type

The parallel performance has been first monitored with four different discretizations of N × N × N type (along x, y, z); in particular, $16^3$, $32^3$, $64^3$ and $96^3$ grid points have been considered, combined with partitions of 1, 2, 4, 6 and 8 processors (Re=1000, Δt=0.0025).

In Fig. 3 the nondimensional run-times are reported as a function of the number of CPUs used for each computational domain; the nondimensional run-time T is defined as the run-time per time step (Δt) with a given number of
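As a simple illustration of how the comparison with the single-processor performance can be expressed, the sketch below computes speedup and parallel efficiency from run-times per time step. The timing values are invented placeholders for the sake of the example, not the measured data of Fig. 3.

    # Illustrative speedup/efficiency computation; the timings below are assumed
    # placeholder values (seconds per time step), not measured results.
    run_time_per_step = {1: 10.0, 2: 5.4, 4: 2.9, 6: 2.1, 8: 1.7}

    t1 = run_time_per_step[1]                       # single-processor reference
    for n_cpu, t_n in sorted(run_time_per_step.items()):
        speedup = t1 / t_n                          # ratio to single-processor run-time
        efficiency = speedup / n_cpu                # fraction of ideal linear speedup
        print(f"{n_cpu} CPUs: speedup = {speedup:.2f}, efficiency = {efficiency:.0%}")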

Concluding remarks

An alternating two-phase wavenumber parallelism has been implemented in the calculations to perform the numerical integration of the three-dimensional, time-dependent, incompressible Navier–Stokes equations on a Convex SPP 1200/XA. The channel flow problem has been considered and both parallel performance and accuracy have been investigated. With respect to parallel performance, two types of computational domains with different numbers of grid points and partitions of 2, 4, 6 and 8 processors

References (27)

  • W.C. Reynolds, Computation of turbulent flows, Annu. Rev. Fluid Mech. (1976)
  • R.S. Rogallo et al., Numerical simulation of turbulent flows, Annu. Rev. Fluid Mech. (1984)
  • J. Kim et al., Turbulence statistics in fully developed channel flow at low Reynolds number, J. Fluid Mech. (1987)