DOI: 10.1145/3468267.3470617
research-article

Algorithm-hardware co-design of a discontinuous Galerkin shallow-water model for a dataflow architecture on FPGA

Published: 26 August 2021

ABSTRACT

We present the first FPGA implementation of the full simulation pipeline of a shallow-water code based on the discontinuous Galerkin method. Using OpenCL and following an algorithm-hardware co-design approach, we transform the software reference into a dataflow architecture that can process a full mesh element per clock cycle. A novel projection approach at the algorithmic level complements the pipeline and memory optimizations in the hardware design. As a result, the FPGA kernels for different polynomial orders outperform the CPU reference by 43x to 144x in a strong-scaling benchmark scenario. A performance model accurately explains the measured FPGA performance of up to 717 GFLOP/s.
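To make the "one mesh element per clock cycle" claim concrete, the following is a minimal sketch of a generic steady-state throughput estimate for a fully pipelined dataflow kernel. It is not the authors' performance model: the function names, the pipeline-latency term, and all input values (clock frequency, floating-point operations per element, mesh size) are hypothetical placeholders chosen for illustration, not figures from the paper.

```python
def pipeline_gflops(clock_mhz: float,
                    flops_per_element: float,
                    elements_per_cycle: float = 1.0) -> float:
    """Steady-state rate in GFLOP/s of a pipeline that retires
    `elements_per_cycle` mesh elements on every clock cycle."""
    elements_per_second = clock_mhz * 1e6 * elements_per_cycle
    return elements_per_second * flops_per_element / 1e9


def runtime_seconds(num_elements: int,
                    num_steps: int,
                    clock_mhz: float,
                    pipeline_latency_cycles: int = 0,
                    elements_per_cycle: float = 1.0) -> float:
    """Estimated runtime, assuming (hypothetically) that the pipeline
    must drain once per time step before the next step can start."""
    cycles_per_step = num_elements / elements_per_cycle + pipeline_latency_cycles
    return num_steps * cycles_per_step / (clock_mhz * 1e6)


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the paper.
    print(pipeline_gflops(clock_mhz=200.0, flops_per_element=1000.0))
    print(runtime_seconds(num_elements=1000, num_steps=10, clock_mhz=100.0))
```

In a model of this shape, the per-step latency term matters most for small meshes, which is one reason a strong-scaling scenario with few elements per device is a demanding test for a deeply pipelined design.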

        Published in

        PASC '21: Proceedings of the Platform for Advanced Scientific Computing Conference
        July 2021
        186 pages
        ISBN: 9781450385633
        DOI: 10.1145/3468267

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        PASC '21 Paper Acceptance Rate: 17 of 33 submissions, 52%
        Overall Acceptance Rate: 83 of 185 submissions, 45%