research-article

Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture on FPGA

Authors:
Tobias Kenter, Paderborn University, Germany
Adesh Shambhu, Paderborn University, Germany
Sara Faghih-Naini, University of Bayreuth, Bayreuth, Germany
Vadym Aizinger, University of Bayreuth, Bayreuth, Germany

PASC '21: Proceedings of the Platform for Advanced Scientific Computing Conference, July 2021, Article No. 13, Pages 1–11. https://doi.org/10.1145/3468267.3470617

Published: 26 August 2021


ABSTRACT

We present the first FPGA implementation of the full simulation pipeline of a shallow-water code based on the discontinuous Galerkin method. Using OpenCL and following an algorithm-hardware co-design approach, we transform the software reference into a dataflow architecture that can process a full mesh element per clock cycle. A novel projection approach on the algorithmic level complements the pipeline and memory optimizations in the hardware design. With this combination, the FPGA kernels for different polynomial orders outperform the CPU reference by 43x to 144x in a strong-scaling benchmark scenario. A performance model accurately explains the measured FPGA performance of up to 717 GFLOP/s.
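
As a rough illustration of why a pipeline that accepts one mesh element per clock cycle is amenable to a simple performance model, the sketch below estimates runtime and sustained throughput for such a design. It is not the authors' model: the function names and every number used (clock frequency, mesh size, step count, pipeline fill latency, FLOPs per element) are hypothetical placeholders chosen only to make the arithmetic concrete.

```python
def pipeline_runtime_s(n_elements, n_steps, f_clock_hz,
                       latency_cycles=0, elements_per_cycle=1.0):
    """Estimated runtime of a dataflow pipeline over a whole simulation.

    Assumption: each time step streams all elements through the pipeline
    once and pays a fixed fill/drain latency before the next step starts.
    """
    cycles_per_step = n_elements / elements_per_cycle + latency_cycles
    return n_steps * cycles_per_step / f_clock_hz


def sustained_gflops(flops_per_element, n_elements, n_steps, runtime_s):
    """Useful floating-point work divided by runtime, in GFLOP/s."""
    return flops_per_element * n_elements * n_steps / runtime_s / 1e9


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT taken from the paper.
    n_elements = 10_000        # mesh elements
    n_steps = 1_000            # time steps
    f_clock_hz = 200e6         # kernel clock frequency
    latency_cycles = 500       # pipeline fill latency per time step
    flops_per_element = 1_000  # arithmetic per element update

    t = pipeline_runtime_s(n_elements, n_steps, f_clock_hz, latency_cycles)
    peak = flops_per_element * f_clock_hz / 1e9
    achieved = sustained_gflops(flops_per_element, n_elements, n_steps, t)
    print(f"runtime: {t:.4f} s, peak: {peak:.1f} GFLOP/s, "
          f"sustained: {achieved:.1f} GFLOP/s")
```

Under these assumptions the gap between peak and sustained throughput comes only from the per-step latency term, which grows in relative weight as the mesh shrinks. This is one reason a strong-scaling scenario is a demanding test for a pipelined design.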


Index Terms

Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture on FPGA
1. Applied computing
  1. Physical sciences and engineering
    1. Earth and atmospheric sciences
      1. Environmental sciences
2. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Reconfigurable computing


Published in
PASC '21: Proceedings of the Platform for Advanced Scientific Computing Conference
July 2021
186 pages
ISBN:9781450385633
DOI:10.1145/3468267
Conference Chair:
Timothy Robinson
ETH Zurich / CSCS
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
FPGA
OpenCL
dataflow
discontinuous Galerkin method
reconfigurable computing
shallow-water simulation
Acceptance Rates
PASC '21 Paper Acceptance Rate: 17 of 33 submissions, 52%
Overall Acceptance Rate: 83 of 185 submissions, 45%