Abstract
Adaptive Cartesian mesh approaches have proven useful for multi-scale applications where particular features can be finely resolved within a large solution domain. Traditional patch-based mesh refinement has demonstrated widespread applicability across a range of problems, but can face performance challenges when applied to very large cases with billions of grid points running on large-scale hybrid CPU/GPU architectures. This work investigates an octree-based method, combined with traditional finite-difference algorithms, that is specifically designed to execute structured mesh refinement applications efficiently on modern cluster architectures. The primary application of the approach is the solution of helicopter rotor aerodynamics, where it is desirable to resolve time-dependent, fine-scale tip vortices within a solution domain that encompasses the entire helicopter and extends several rotor diameters away. This work demonstrates that the octree construction and balance algorithms scale to billions of mesh cells. A canonical problem (convecting vortex) and two application problems (helicopter rotor simulations) verify and validate the performance and accuracy of the developed framework, Orchard, on CPU and GPU architectures. Scaling on CPUs and GPUs is demonstrated up to 140 Xeon sockets and 36 V100 GPUs, respectively. The solver on GPUs demonstrates an order-of-magnitude speedup over execution on traditional CPU cluster nodes.
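To illustrate why octree refinement suits a small feature embedded in a large domain, the following minimal sketch refines a unit cube toward a single point (standing in for a tip-vortex core) and compares the leaf count with a uniform grid at the same finest resolution. The function name `refine_toward_point` and the `(level, i, j, k)` cell encoding are assumptions made for this illustration and are not taken from Orchard. The sketch also omits the 2:1 balance step and the parallel partitioning that the abstract refers to.

```python
def refine_toward_point(point, max_level):
    """Return the leaf cells of an octree on the unit cube.

    Each leaf is a tuple (level, i, j, k). At every level only the cell
    containing `point` is split into eight children. Illustrative only:
    no 2:1 balance is enforced between neighbouring leaves.
    """
    leaves = []
    cell = (0, 0, 0, 0)  # root cell covering the whole unit cube
    for _ in range(max_level):
        level, i, j, k = cell
        child_level = level + 1
        n = 1 << child_level  # cells per side at the child level
        # Integer index of the child-level cell that contains the point
        # (clamped so that a coordinate of exactly 1.0 stays in range).
        target = tuple(min(int(c * n), n - 1) for c in point)
        for di in (0, 1):
            for dj in (0, 1):
                for dk in (0, 1):
                    child = (child_level, 2 * i + di, 2 * j + dj, 2 * k + dk)
                    if child[1:] == target:
                        cell = child          # keep refining this one
                    else:
                        leaves.append(child)  # sibling stays a leaf
    leaves.append(cell)  # finest cell containing the point
    return leaves


if __name__ == "__main__":
    max_level = 5
    leaves = refine_toward_point((0.3, 0.6, 0.1), max_level)
    uniform = 8 ** max_level
    print(f"octree leaves: {len(leaves)}, uniform cells: {uniform}")
```

With five levels, the refined tree holds 36 leaves (seven siblings per level plus the finest cell), while a uniform grid at the same finest spacing needs 8^5 = 32768 cells. A production solver would refine around many features at once and then rebalance, so real savings depend on how localized the features are.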
Acknowledgements
Presented materials are products of the CREATE-AV Element of the Computational Research and Engineering for Acquisition Tools and Environments (CREATE) Program sponsored by the U.S. Department of Defense HPC Modernization Program Office. Funding and support for the presented research are also provided by the U.S. Army DEVCOM Aviation and Missile Center. Computer resources for some calculations were provided by the DoD HPCMP Frontier Program.
Cite this article
Jude, D., Sitaraman, J. & Wissink, A. An octree-based, cartesian navier–stokes solver for modern cluster architectures. J Supercomput 78, 11409–11440 (2022). https://doi.org/10.1007/s11227-022-04324-7
DOI: https://doi.org/10.1007/s11227-022-04324-7