An octree-based, Cartesian Navier–Stokes solver for modern cluster architectures

The Journal of Supercomputing

Abstract

Adaptive Cartesian mesh approaches have proven useful for multi-scale applications where particular features can be finely resolved within a large solution domain. Traditional patch-based mesh refinement has demonstrated widespread applicability across a range of problems, but can face performance challenges when applied to very large cases with billions of grid points running on large-scale hybrid CPU/GPU architectures. This work investigates an octree-based method, combined with traditional finite-difference algorithms, that is specifically designed to execute structured mesh refinement applications efficiently on modern cluster architectures. The primary application of the approach is the solution of helicopter rotor aerodynamics, where it is desirable to resolve time-dependent, fine-scale tip vortices within a solution domain that encompasses the entire helicopter and extends several rotor diameters away. This work demonstrates that the octree construction and balance algorithms scale to billions of mesh cells. A canonical problem (convecting vortex) and two application problems (helicopter rotor simulations) verify and validate the performance and accuracy of the developed framework, Orchard, on CPU and GPU architectures. Scaling on CPUs and GPUs is demonstrated up to 140 Xeon sockets and 36 V100 GPUs, respectively. On GPUs, the solver achieves an order-of-magnitude speedup over execution on traditional CPU cluster nodes.
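The abstract refers to octree construction and balance but, in this preview, does not show how Orchard implements them. As a rough illustration only of what a 2:1 balance constraint means for an octree mesh, the Python sketch below refines a unit cube toward an interior point and then splits coarse leaves until no leaf is more than one level finer than a face neighbor. All names here (`refine`, `balance`, `is_balanced`, the `(level, i, j, k)` tuple layout) are hypothetical, and the serial, set-based loop is an assumption chosen for readability, not the scalable parallel algorithm the paper describes.

```python
from itertools import product

# An octant is (level, i, j, k): integer coordinates on a 2**level grid per axis.
# All names and this data layout are illustrative assumptions, not Orchard's API.

def children(octant):
    """Return the eight children of an octant."""
    l, i, j, k = octant
    return [(l + 1, 2 * i + a, 2 * j + b, 2 * k + c)
            for a, b, c in product((0, 1), repeat=3)]

def refine(leaves, octant):
    """Replace a leaf by its eight children."""
    leaves.remove(octant)
    leaves.update(children(octant))

def find_leaf(leaves, octant):
    """Return the leaf equal to or containing `octant`, or None if the
    region is covered by finer leaves instead."""
    l, i, j, k = octant
    while l >= 0:
        if (l, i, j, k) in leaves:
            return (l, i, j, k)
        l, i, j, k = l - 1, i // 2, j // 2, k // 2
    return None

def face_neighbors(octant):
    """Yield same-level face neighbors that lie inside the unit cube."""
    l, i, j, k = octant
    n = 1 << l
    for di, dj, dk in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                       (0, -1, 0), (0, 0, 1), (0, 0, -1)):
        ni, nj, nk = i + di, j + dj, k + dk
        if 0 <= ni < n and 0 <= nj < n and 0 <= nk < n:
            yield (l, ni, nj, nk)

def is_balanced(leaves):
    """True if no leaf is more than one level finer than a face neighbor."""
    for octant in leaves:
        for nb in face_neighbors(octant):
            coarse = find_leaf(leaves, nb)
            if coarse is not None and coarse[0] < octant[0] - 1:
                return False
    return True

def balance(leaves):
    """Enforce face 2:1 balance by repeatedly splitting too-coarse neighbors."""
    changed = True
    while changed:
        changed = False
        # Visit the finest leaves first so splits ripple outward.
        for octant in sorted(leaves, key=lambda o: -o[0]):
            if octant not in leaves:  # already split earlier in this sweep
                continue
            for nb in face_neighbors(octant):
                coarse = find_leaf(leaves, nb)
                if coarse is not None and coarse[0] < octant[0] - 1:
                    refine(leaves, coarse)
                    changed = True
    return leaves

if __name__ == "__main__":
    # Refine toward a point just below the cube center, which leaves level-4
    # cells directly adjacent to level-1 cells (an unbalanced tree).
    leaves = {(0, 0, 0, 0)}
    for octant in [(0, 0, 0, 0), (1, 0, 0, 0), (2, 1, 1, 1), (3, 3, 3, 3)]:
        refine(leaves, octant)
    print("balanced before:", is_balanced(leaves), "leaves:", len(leaves))
    balance(leaves)
    print("balanced after:", is_balanced(leaves), "leaves:", len(leaves))
```

The sketch only balances across faces. Production octree libraries typically also offer edge and corner balance and operate on distributed, linearized trees.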

Acknowledgements

Presented materials are products of the CREATE-AV Element of the Computational Research and Engineering for Acquisition Tools and Environments (CREATE) Program sponsored by the U.S. Department of Defense HPC Modernization Program Office. Funding and support for the presented research are also provided by the U.S. Army DEVCOM Aviation and Missile Center. Computer resources for some calculations were provided by the DoD HPCMP Frontier Program.

Author information

Corresponding author

Correspondence to Dylan Jude.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jude, D., Sitaraman, J. & Wissink, A. An octree-based, Cartesian Navier–Stokes solver for modern cluster architectures. J Supercomput 78, 11409–11440 (2022). https://doi.org/10.1007/s11227-022-04324-7

  • DOI: https://doi.org/10.1007/s11227-022-04324-7
