FPGA-Based Computational Fluid Dynamics Simulation Architecture via High-Level Synthesis Design Method

Du, Changdao; Firmansyah, Iman; Yamaguchi, Yoshiki

doi:10.1007/978-3-030-44534-8_18

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12083)

Included in the following conference series:

International Symposium on Applied Reconfigurable Computing

Abstract

Today’s High-Performance Computing (HPC) systems often use GPUs as dedicated hardware accelerators to meet the computational requirements of applications such as neural networks, genetic decoding, and hydrodynamic simulations. FPGAs are also considered suitable alternative accelerators because of their growing computational capability and low power consumption. Moreover, advances in High-Level Synthesis (HLS) allow users to generate FPGA designs directly from mainstream languages such as C, C++, and OpenCL. However, writing high-level programs that perform well is still time-consuming, and a lack of knowledge about FPGA architecture can lead to poor scalability and portability. In this paper, we propose an architecture design for Computational Fluid Dynamics (CFD) simulations based on the HLS method. Our design can scale its performance by exploiting parallelism in both the temporal and spatial domains of CFD simulations. We also discuss optimization choices for the data reuse buffer while considering the portability of HLS code. A performance model is introduced to guide design space exploration under the resource constraints of the FPGA. We evaluate our design on a Xilinx VCU1525 FPGA board and compare the results with other state-of-the-art studies. Experimental results show that the VCU1525 achieves 629.6 GFLOP/s with the D2Q9 LBM-BGK model, and that the design and optimization methods can be used to develop various CFD applications.
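
The abstract names the D2Q9 LBM-BGK model without defining it. For orientation, the per-cell update that such a design would pipeline can be sketched in plain C++ as follows. This is a minimal sketch assuming the textbook D2Q9 lattice (nine discrete velocities with weights 4/9, 1/9, and 1/36) and the single-relaxation-time BGK collision rule; it is not the authors' HLS kernel. The names `Cell`, `density`, `equilibrium`, and `bgk_collide`, and the relaxation parameter `tau`, are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Textbook D2Q9 lattice: one rest velocity, four axis-aligned, four diagonal.
constexpr std::size_t Q = 9;
constexpr std::array<int, Q> CX = {0, 1, 0, -1, 0, 1, -1, -1, 1};
constexpr std::array<int, Q> CY = {0, 0, 1, 0, -1, 1, 1, -1, -1};
constexpr std::array<float, Q> W = {
    4.0f / 9.0f,
    1.0f / 9.0f,  1.0f / 9.0f,  1.0f / 9.0f,  1.0f / 9.0f,
    1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f};

// One lattice cell holds nine distribution values (hypothetical layout).
using Cell = std::array<float, Q>;

// Macroscopic density: the sum of all nine distributions.
float density(const Cell& f) {
    float rho = 0.0f;
    for (std::size_t i = 0; i < Q; ++i) rho += f[i];
    return rho;
}

// Second-order equilibrium distribution for density rho and velocity (ux, uy).
Cell equilibrium(float rho, float ux, float uy) {
    Cell feq{};
    const float usq = ux * ux + uy * uy;
    for (std::size_t i = 0; i < Q; ++i) {
        const float cu = static_cast<float>(CX[i]) * ux +
                         static_cast<float>(CY[i]) * uy;
        feq[i] = W[i] * rho * (1.0f + 3.0f * cu + 4.5f * cu * cu - 1.5f * usq);
    }
    return feq;
}

// BGK collision: relax every distribution toward equilibrium at rate 1/tau.
Cell bgk_collide(const Cell& f, float tau) {
    assert(tau > 0.5f);  // standard stability bound for BGK
    const float rho = density(f);
    float ux = 0.0f;
    float uy = 0.0f;
    for (std::size_t i = 0; i < Q; ++i) {
        ux += static_cast<float>(CX[i]) * f[i];
        uy += static_cast<float>(CY[i]) * f[i];
    }
    ux /= rho;
    uy /= rho;
    const Cell feq = equilibrium(rho, ux, uy);
    Cell out{};
    for (std::size_t i = 0; i < Q; ++i) {
        out[i] = f[i] - (f[i] - feq[i]) / tau;
    }
    return out;
}
```

The collision touches only one cell's nine values, so it has no dependence on neighbouring cells; the subsequent streaming step, which moves each `f[i]` to the neighbour at offset `(CX[i], CY[i])`, is where a stencil design needs buffered access to nearby cells. In an HLS flow the fixed-length loops over `Q` would typically be unrolled, but which directives to use depends on the toolchain and is left out of this sketch.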

Acknowledgments

This work was supported in part by MEXT as Next Generation High-Performance Computing Infrastructures and Applications R&D Program (Development of Computing-Communication Unified Supercomputer in Next Generation), and by JSPS KAKENHI Grant Numbers JP17H01707 and JP18H03246. The authors would also like to thank Xilinx Inc. for providing FPGA software tools through the Xilinx University Program.

Author information

Authors and Affiliations

University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan
Changdao Du, Iman Firmansyah & Yoshiki Yamaguchi

Corresponding author

Correspondence to Changdao Du.

Editor information

Editors and Affiliations

Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Fernando Rincón
Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Jesús Barba
Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong, China
Hayden K. H. So
INESC-ID, Lisbon, Portugal
Pedro Diniz
Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Julián Caba

About this paper

Cite this paper

Du, C., Firmansyah, I., Yamaguchi, Y. (2020). FPGA-Based Computational Fluid Dynamics Simulation Architecture via High-Level Synthesis Design Method. In: Rincón, F., Barba, J., So, H., Diniz, P., Caba, J. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham. https://doi.org/10.1007/978-3-030-44534-8_18

DOI: https://doi.org/10.1007/978-3-030-44534-8_18
Published: 25 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44533-1
Online ISBN: 978-3-030-44534-8
eBook Packages: Computer Science, Computer Science (R0)
