Parallelisation study of a three-dimensional environmental flow model
Introduction
Numerical modelling has several advantages in the study of coastal ocean flow processes and events. Chief among these are the reduced cost and ease of deployment of a numerical model compared to field work or other methods of investigation. In addition, a numerical model is easier to configure to investigate different flow conditions and scenarios. However, with the drive towards more realistic and detailed simulations, the computational demands of numerical solutions increase, primarily due to finer grid resolution and the simulation of a greater number of passive and active tracers. As a result, the practical ability of numerical models to solve real-world problems is constrained. Parallel computing allows faster execution and the ability to perform larger, more detailed simulations than is possible with serial code. The research reported here presents details of the porting of an existing coastal ocean model from serial to parallel code. This work is driven partly by a desire to run larger, more detailed simulations, but also to allow experimentation with more computationally demanding methods of data assimilation to improve the performance of real-time predictive modelling.
The model used for the study, the Environmental Fluid Dynamics Code (EFDC), is a widely used, three-dimensional, finite difference, hydrodynamic model (Hamrick, 1992). The parallelisation adopts an efficient domain decomposition approach that theoretically permits deployment on a large cluster of machines; however, the fundamental objective of our work centres on the real-time simulation capabilities of a given model on a commodity blade system, not optimal scalability on an arbitrarily large system. We were therefore guided by the following requirements, considered key to the success of the parallelisation effort and subsequent operation on similar cluster systems:
1. Limited changes to the large number of source files (approximately 50 000 lines of code), to avoid introducing computational errors.
2. Binary regression of the parallel model against serial simulations, to ensure the simulation runs in parallel exactly as it ran serially. Even a small deviation could mask the presence of an error in the port.
3. Automation of the setup process for a parallel run, so that models originally set up for the serial code run correctly on the parallel code. This involves automatic generation of source code specific to each parallel run of a model, avoiding manual effort and the introduction of errors.
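The binary-regression requirement above can be checked mechanically by hashing every output file from matched serial and parallel runs. The sketch below is illustrative only: the directory layout and the `*.OUT` file pattern are assumptions, not part of the original work.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def binary_regression(serial_dir: Path, parallel_dir: Path, pattern: str = "*.OUT"):
    """Compare every output file produced by a serial run against the
    corresponding file from a parallel run of the same model setup.

    Returns the list of files that differ (or are missing); an empty list
    means the parallel port reproduces the serial results bit for bit.
    """
    mismatches = []
    for serial_file in sorted(serial_dir.glob(pattern)):
        parallel_file = parallel_dir / serial_file.name
        if (not parallel_file.exists()
                or file_digest(serial_file) != file_digest(parallel_file)):
            mismatches.append(serial_file.name)
    return mismatches
```

A non-empty result flags exactly which output streams diverged, which is usually enough to localise a halo-exchange or indexing error in the port.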
Several parallel versions of numerical ocean models have already been described in the literature, and they share computational methods with other codes in the geosciences. Wang et al. (1997) present elements of the widely used Parallel Ocean Program (POP), while Beare and Stevens (1997) build on the parallelisation of the Modular Ocean Model (MOM). However, the fundamental structure of these models makes them more suitable for global, ocean-scale problems, and they are not as well suited to the finer scale resolution of coastal water phenomena. Parallelisation studies of the Princeton Ocean Model (POM) and the Regional Ocean Modelling System (ROMS) are discussed by Sannino et al. (2001) and Wang et al. (2005), respectively. A common feature of these models is the adoption of a split-explicit formulation of the equations governing vertically averaged transport. This representation permits easier parallelisation since global communication in the horizontal is eliminated. However, the maximum computational timestep is then constrained by the Courant–Friedrichs–Lewy restriction (Ezer et al., 2002), as opposed to the greater numerical flexibility provided by implicit approaches (Jin et al., 2000). De Marchis et al. (2012) present details of a parallel code that adopts finite volume methods for the solution of the fundamental governing equations.
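The timestep penalty of explicit free-surface schemes mentioned above is easy to quantify: the external gravity wave, travelling at roughly sqrt(gH), must not cross a grid cell in one step. The sketch below uses the standard 2-D stability bound; the specific grid spacing and depth are illustrative values, not parameters from the Galway Bay model.

```python
import math


def barotropic_cfl_dt(dx: float, depth: float, g: float = 9.81) -> float:
    """Maximum stable timestep (s) for an explicit free-surface scheme on a
    square grid of spacing dx (m) over water of depth H (m), limited by the
    external gravity-wave speed sqrt(g*H); the 1/sqrt(2) factor accounts
    for diagonal propagation on a 2-D grid."""
    wave_speed = math.sqrt(g * depth)
    return dx / (wave_speed * math.sqrt(2.0))


# e.g. a 100 m grid over 30 m of water limits an explicit scheme to a few
# seconds per step, whereas a semi-implicit solver can step well beyond
# this at the price of a global linear solve each timestep.
dt = barotropic_cfl_dt(dx=100.0, depth=30.0)
```

This is exactly the trade-off the split-explicit models make: cheap, communication-light steps, but many of them; the implicit approach buys a longer step with a globally coupled (and harder to parallelise) solve.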
Among all branches of the geosciences, atmospheric modelling was one of the first to use parallel computers, due to the intrinsic needs of both weather models that run in real time and climate models that operate over time scales of centuries. Coupling with ocean models creates similar computational demands that benefit from parallel computation. Drake et al. (1993) present details of the parallel version of the NCAR Community Climate Model, CCM2; the parallelisation strategy decomposes the model domain into geographical patches, with a message passing library conducting communication between the segregated domains. Wolters and Cats (1993) describe the parallelisation strategy of the HIRLAM model, a state-of-the-art system for weather forecasts up to 48 h ahead, while Fournier et al. (2004) discuss aspects of deploying a spectral element atmospheric model in parallel. Michalakes et al. (1998) describe the parallelisation approach adopted for the widely used Weather Research and Forecast model.
In the following sections, the model is introduced along with a description of the computational schemes used to solve the governing equations. Section 3 discusses the parallelisation strategy adopted, with particular emphasis on load balancing of the computation within an irregular coastal waterbody. Section 4 presents the parallel speedup and performance of the amended model; a case study analysis focuses on Galway Bay, on the west coast of Ireland, to enable a realistic assessment of the practical gain. The conclusions and a discussion are found in Section 5.
Model description
EFDC is a public domain, open source, modelling package for simulating three-dimensional flow, transport and biogeochemical processes in surface water systems. The model is specifically designed to simulate estuaries and subestuarine components (tributaries, marshes, wet and dry littoral margins), and has been applied to a wide range of environmental studies in the Chesapeake Bay region (Shen et al., 1999). It is presently being used by universities, research organisations and governmental agencies.
Parallelisation
EFDC is a Fortran 77 code originally designed for deployment on vector computers rather than distributed systems. The code was configured to achieve a degree of parallelisation on shared memory processors through compiler directives, specific to vectorised architectures, inserted in the source. However, this existing vectorisation code is of no benefit for parallelisation on distributed memory systems. To obtain performance comparable to vector systems, scalable cache-based processors achieve speedup through distribution of the computation across processors.
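The essence of a distributed-memory decomposition is that each rank advances only its own subdomain and exchanges one row of ghost (halo) cells with its neighbours each step. The sketch below emulates this serially in Python (the model itself is Fortran with message passing): a 5-point stencil stands in for the model's finite difference operators, and `numpy.array_split` stands in for the strip decomposition. Because each subdomain applies the identical arithmetic to the identical values, the stitched result matches the serial sweep bit for bit, which is the property the binary-regression requirement exploits.

```python
import numpy as np


def laplacian(u):
    """5-point Laplacian of the interior of u (boundary rows/cols dropped)."""
    return (u[:-2, 1:-1] + u[2:, 1:-1]
            + u[1:-1, :-2] + u[1:-1, 2:]
            - 4.0 * u[1:-1, 1:-1])


def decomposed_laplacian(u, nparts):
    """Compute the same stencil after splitting u into horizontal strips,
    each padded with one ghost row per neighbour -- the halo exchange a
    message-passing run would perform every timestep."""
    n = u.shape[0]
    strips = np.array_split(np.arange(n), nparts)
    out_rows = []
    for strip in strips:
        lo, hi = strip[0], strip[-1] + 1
        glo, ghi = max(lo - 1, 0), min(hi + 1, n)  # one ghost row each side
        local = u[glo:ghi]                         # halo "received" here
        lap = laplacian(local)                     # purely local sweep
        a = max(lo, 1) - (glo + 1)                 # trim rows owned by
        b = a + (min(hi, n - 1) - max(lo, 1))      # neighbouring strips
        out_rows.append(lap[a:b])
    return np.vstack(out_rows)
```

Only the one-row halo crosses subdomain boundaries, so per-step communication volume scales with the perimeter of each strip rather than its area.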
Performance
All performance tests were conducted on a local commodity blade cluster of five nodes. Each compute node had an Intel Xeon X5690 hex-core processor, with a clock speed of 3.47 GHz and 12 MB of cache; the nodes are connected by a 1 Gbit/s Ethernet network. Parallel simulations were configured to deploy on the smallest number of blades possible, to minimise unnecessary network communication.
Experiments have been performed on a typical coastal region application, Galway Bay, to investigate the performance of the parallel code.
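Parallel performance of this kind is normally summarised by speedup S(p) = T(1)/T(p) and parallel efficiency E(p) = S(p)/p, computed from wall-clock timings of the same simulation on p cores. A minimal sketch follows; the timing values in the example are hypothetical placeholders, not the measured Galway Bay results.

```python
def speedup_and_efficiency(t_serial, timings):
    """Given the serial wall-clock time and a mapping {cores: seconds} for
    parallel runs of the same simulation, return {cores: (speedup,
    efficiency)} where speedup S(p) = T(1)/T(p) and efficiency
    E(p) = S(p)/p (1.0 would be ideal linear scaling)."""
    table = {}
    for p, t in sorted(timings.items()):
        s = t_serial / t
        table[p] = (s, s / p)
    return table


# Hypothetical timings for illustration only (not measured values):
results = speedup_and_efficiency(3600.0, {2: 1900.0, 4: 1000.0, 8: 560.0})
```

Efficiency typically falls as cores are added on a Gbit Ethernet cluster, since the halo-exchange and collective-communication cost grows relative to the shrinking per-core compute load.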
Discussion and conclusions
This study presents details on the parallelisation of a widely used environmental flow model. Preliminary results demonstrate that considerable speedup can be achieved on a distributed cluster by adopting a pragmatic approach to the parallelisation effort, with a load-balanced domain decomposition based on the underlying numerical algorithms. Note that this study presents details only on the hydrodynamic simulation itself, and not more computationally demanding aspects of a simulation.
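The load-balanced decomposition referred to above must account for the irregular shape of a coastal waterbody: strips of equal width would leave ranks with mostly dry (inactive) cells idle. A minimal sketch of one way to balance the work, assuming a wet/dry mask and contiguous column strips (the actual partitioning in the model may differ), is:

```python
import numpy as np


def balanced_strips(wet_mask, nparts):
    """Split the columns of a boolean wet/dry mask into nparts contiguous
    strips holding roughly equal numbers of wet cells, so that each rank
    performs a similar amount of work in an irregular coastal domain.

    Returns a list of (start_col, end_col) half-open column ranges."""
    wet_per_col = wet_mask.sum(axis=0)
    cum = np.cumsum(wet_per_col)          # running total of wet cells
    total = cum[-1]
    # place each cut where the running total first reaches k/nparts of it
    targets = [total * k / nparts for k in range(1, nparts)]
    cuts = [int(np.searchsorted(cum, t)) + 1 for t in targets]
    bounds = [0] + cuts + [wet_mask.shape[1]]
    return [(bounds[i], bounds[i + 1]) for i in range(nparts)]
```

Balancing on wet-cell counts rather than column counts keeps per-rank compute time comparable, which matters because the slowest subdomain sets the pace of every synchronised timestep.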
References (29)
- Ezer, T., et al., 2002. Developments in terrain-following ocean models: intercomparisons of numerical aspects. Ocean Model.
- Griffies, S.M., et al., 2000. Developments in ocean climate modelling. Ocean Model.
- et al., 2013. Parallelization of a hydrological model using the message passing interface. Env. Model. Softw.
- Beare, M.I., Stevens, D.P., 1997. Optimisation of a parallel ocean general circulation model.
- Blumberg, A.F., Mellor, G.L., 1987. A description of a three-dimensional coastal ocean circulation model. Coast. Estuar. Sci.
- De Marchis, M., et al., 2012. Wind- and tide-induced currents in the Stagnone Lagoon (Sicily). Env. Fluid Mech.
- et al., 2006. Parallel Computational Fluid Dynamics 2005: Theory and Applications.
- Drake, J., Flanery, R., Walker, D., Worley, P., Foster, I., Michalakes, J., Stevens, R., Hack, J., Williamson, D., ...
- Fiduccia, C.M., Mattheyses, R.M., 1982. A linear-time heuristic for improving network partitions. In: 19th Conference...
- Fournier, A., et al., 2004. The spectral element atmosphere model (SEAM): high-resolution parallel computation and localized resolution of regional dynamics. Mon. Weather Rev.
- Tracer conservation with an explicit free surface method for z-coordinate ocean models. Mon. Weather Rev.